MODIFIED PROTEINS AND METHODS OF THEIR PRODUCTION

CROSS-REFERENCE TO RELATED APPLICATION 

This Application is a Continuation-In-Part Application of co-pending
Application Serial No. 08/496,247, filed June 28, 1995, which is a
Continuation-In-Part Application of co-pending Application Serial No.
08/146,885, filed November 3, 1993, which is a Continuation-In-Part of
co-pending Application Serial No. 08/004,139, filed December 9, 1992.

BACKGROUND OF THE INVENTION



The present invention is directed to modified proteins and methods 
of producing the same. More specifically, the modified protein of the 
present invention comprises a target protein and a controllable 
intervening protein sequence (CIVPS), the CIVPS being capable of 
excision or cleavage under predetermined conditions.

Production of mature proteins involves the flow of information from
DNA to RNA to protein. Precise excision of DNA and RNA elements
which interrupt that information has been previously described (M.
Belfort, Annu. Rev. Genet. 24:363 (1990); T.R. Cech, Annu. Rev.
Biochem. 59:543 (1990); Hunter et al., Genes Dev. 3:2101 (1989)). More
recently, evidence for the precise excision of intervening protein
sequences has also been described for the TFP1 allele from
Saccharomyces cerevisiae (Hirata et al., J. Biol. Chem. 265:6726
(1990); Kane et al., Science 250:651 (1990)) and the recA gene from
Mycobacterium tuberculosis (Davis et al., J. Bact. 173:5653 (1991);
Davis et al., Cell 71:1 (1992)). Each contains internal in-frame peptide
segments which must be removed to produce the mature protein.
Expression of Tfp1 and RecA each results in two peptides: one
representing the intervening protein sequence (IVPS) and the other the
ligated product of the external protein sequences (EPS). This
post-translational processing event has been termed "protein splicing".
Similarly, the Vent DNA polymerase gene from the hyperthermophilic
archaea Thermococcus litoralis contains two in-frame IVPSs (Perler, et
al., PNAS 89:5577 (1992)).

A major impediment to the development of methods of using IVPSs 
or protein splicing in other than research applications has been the 
inability to control the activity of the IVPS and thus the splicing event. 

Thus, it would be desirable to have a method which provides a
ready means to modify a target protein using an IVPS, particularly where
the activity of the IVPS is controllable. It would also be desirable to have
a method which can specifically modify target proteins such that their
activity is substantially inactivated. It would further be desirable to have
a method which can be used to restore the activity of an inactivated
modified protein.

SUMMARY OF THE INVENTION 

The present invention relates to modified proteins comprising an
IVPS and a target protein, the IVPS being capable of excision by protein
splicing, or cleavage in the absence of splicing, under predetermined
conditions in either cis or in trans. Such predetermined conditions
depend on the IVPS used and can include, for example, increase in
temperature, changes in pH conditions, exposure to light,
dephosphorylation or deglycosylation of amino acid residues, or
exposure to chemical reagents which induce cleavage/splicing. The
IVPS may be joined with the target protein either by inserting the IVPS
into the target protein or fusing the IVPS with the target protein at either
the amino or carboxy terminal end of the target protein. These IVPSs,
referred to as controllable intervening protein sequences (CIVPS), are
therefore useful in controlling the splicing or cleavage reaction. The
present invention further relates to methods for producing, selecting and
testing CIVPSs.

In one preferred embodiment, a DNA sequence encoding a CIVPS
is inserted into, or joined with, a DNA sequence encoding a target
protein such that both coding sequences form a continuous open
reading frame. Thereafter, expression of this fusion DNA is utilized to
produce the modified target protein. In another embodiment, the
modified protein so produced is subjected to predetermined conditions
under which the CIVPS will be excised or cleaved. In certain
embodiments, the CIVPS is inserted into a region of the target protein
which renders the target protein substantially inactive and excision of the
CIVPS restores the activity of the target protein.

Preferred CIVPSs include CIVPS1 and 2 obtainable from T.
litoralis (also sometimes referred to as Vent IVPS1 and 2 or IVS1 and 2)
and CIVPS3 obtainable from Pyrococcus sp. (also sometimes referred to
as Deep Vent IVPS1 or IVS1). These CIVPSs are capable of excision,
i.e., removal via protein splicing, from modified proteins upon an
increase in temperature. Other preferred CIVPSs include those
obtainable from yeast such as Saccharomyces cerevisiae.

In accordance with the present invention, it has also been found
that certain CIVPS amino acid residues and at least the first downstream
amino acid residue modulate the splicing reaction and that modification
of these residues decreases or stops the splicing reaction. These
residues have been shown to be conserved in other IVPSs. Modification
of such residues can be used to convert an IVPS to a CIVPS.

In accordance with the present invention, it has been found that in 
certain situations, the complete splicing reaction is not necessary or 
desirable. In such situations, the CIVPS can be modified to allow 
cleavage in the absence of splicing, thus allowing for controlled
separation or cleavage of the CIVPS from the target protein.

The potential uses for the modified proteins and CIVPSs of the
present invention are manifold. These include, for example, control of a
target protein's enzymatic activity, purification of modified proteins using
antibodies specific to the CIVPS by affinity chromatography and
production of proteins that are toxic to host cells.

The CIVPSs of the present invention may further be used in a
method of protein purification in which a modified protein comprising a
target protein fused to a CIVPS is produced. If desired, a three-part
fusion can be produced in which the CIVPS is between the target protein
and a protein having affinity for a substrate (binding protein), e.g., MBP.
The modified protein is then contacted with a substrate to which the
CIVPS or binding protein has specific affinity, e.g., using affinity
chromatography. The highly purified target protein can be liberated from
the column by subjecting the CIVPS to predetermined conditions under
which cleavage, for example, between the CIVPS and the target protein
is initiated. Alternatively, the fusion protein can be purified as above and
then the target protein released from the fusion by subjecting the CIVPS
to predetermined conditions.

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 depicts the amino acid sequence (SEQ ID NO:30, SEQ ID
NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34, SEQ ID NO:36,
SEQ ID NO:37, SEQ ID NO:38 and SEQ ID NO:39) of proposed protein
splice junctions. Amino-terminal (top) and carboxy-terminal (bottom)
splice junctions are shown with splice sites indicated by arrows and
conserved or similar amino acids boxed.

Figure 2 illustrates insertion of IVPS into the EcoRV site of the
β-galactosidase gene. PCR products of either Deep Vent IVPS1 (CIVPS3)
or Vent IVPS2 (CIVPS2) are ligated to EcoRV digested pAH05 between
the Asp and Ile residues of β-galactosidase to produce a modified
β-galactosidase product.

Figure 3A and 3B are graphs showing that splicing of modified
β-galactosidase yields active β-galactosidase. Incubation of crude
extracts from hosts expressing the indicated IVPS-β-galactosidase
fusion proteins at 42°C yields an increase in enzyme activity with time,
whereas incubation at 42°C with the host alone (RR1) or an unmodified
β-galactosidase construct (pAH05) shows no increase in enzyme
activity.

Figure 4 is a western blot showing the results of temperature
controlled protein splicing experiments. CIVPS2 and CIVPS3 were
cloned into the EcoRV site of β-galactosidase. Western blot examination
of cell extracts with sera directed against β-galactosidase or the CIVPS
protein (I-TliI and I-PspI, respectively) detects modified β-galactosidase
fusion protein (Lanes 1, 4, 7, 10). Treatment of extracts at 42°C (Lanes
2, 5, 8, 11) or 50°C (Lane 12) for 6 hours results in splicing and the
production of free CIVPS proteins and unmodified β-galactosidase
(except for retained serine or threonine residue, see text, Examples 2 & 3).
Unmodified β-galactosidase from pAH05 is in lane 6. Lanes 3 & 9
contain size markers.

Figure 5 shows, by western blot examination of cell extracts with
sera directed against β-agarase, the detection of modified β-agarase
fusion protein. Lanes 1 & 4: size markers; Lanes 2 & 5: β-agarase
standard; lane 3: CIVPS2-β-agarase fusion; lane 6: CIVPS3-β-agarase
fusion.

Figure 6 illustrates insertion of IVPS2 (CIVPS2) into the
β-galactosidase gene by creation of new restriction sites (BspEI and SpeI)
within the IVPS by silent mutations.

Figure 7 illustrates insertion of either Deep Vent IVPS1 (CIVPS3)
or Vent IVPS2 (CIVPS2) into the β-galactosidase gene by creation of
new restriction sites (XbaI and SalI) by silent mutations within the target
gene.



Figure 8 is a plasmid map of pANG5.

Figure 9 is an autoradiogram of SDS-PAGE showing suppressor
tRNA-mediated incorporation of a chemically blocked serine at the
upstream junction of CIVPS2.

Figure 10 is an autoradiogram of SDS-PAGE showing the splicing
reaction of CIVPS2 initiated by visible light irradiation of a chemically
blocked precursor protein.

Figure 11 is a gel showing temperature controlled protein splicing
and cleavage. Deep Vent IVPS1 (CIVPS3) cassettes were cloned into
the EcoRV site of β-galactosidase. Western blot analysis was used to
examine cell extracts of pDV7 (CIVPS3 cassette, lanes 1-3), pDVC302
(CIVPS3/Cys cassette, lanes 4-6), pDVT321 (CIVPS3/Thr cassette,
lanes 7-9) and pDVS712 (CIVPS3/Ser cassette, lanes 10-12). Antibody
directed against the CIVPS3 protein (I-PspI) (NEB) detects fusion
proteins and cleavage products including free CIVPS3, N-EPS-CIVPS3
and CIVPS3-C-EPS (from cleavage at one of the splice junctions). The
untreated extracts were in lanes 1, 4, 7, and 10. Treatment of extracts at
42°C (lanes 2, 5, 8, and 11) or 65°C (lanes 3, 6, 9, and 12) for 2 hours
results in increased splicing and/or cleavage activity with differing
efficiencies.



Figure 12 is a Western blot showing temperature controlled protein
splicing and cleavage. Western blot analyses using antibodies directed
against I-PspI and β-galactosidase (C-EPS domain) (Promega) were
used to examine fusion constructs pDVC302 (lanes 1-3), pDVT321
(lanes 4-6) and pDVS712 (lanes 7-9). Treatment of extracts at 42°C
(lanes 2, 5, and 8) or 65°C (lanes 3, 6, and 9) for 2 hours results in
splicing (in pDVS712) or cleavage. Protein splicing in pDVS712 extract
produced free CIVPS3 protein, I-PspI, and unmodified β-galactosidase
(except for retained serine). Lane 1 contains size markers.

Figure 13A and 13B show the purification of MIP precursor on
amylose and MonoQ columns examined by Coomassie blue staining
and immunoblot. The diagram between Parts A and B represents the
proposed structure of each band, including the branched molecule MIP*.
The black boxes represent the MBP domain, the white boxes the IVPS
domain and the gray boxes the paramyosin ΔSal domain. The pluses (+)
indicate that the sample was heat treated at 37°C for 120 min.; minuses
(-) indicate that the sample was not heat treated. Part A: Coomassie
blue stained gel. Total, crude supernatants from MIP cultures. F.T.,
amylose resin flow through. Amylose eluate (-), amylose resin purified
MIP preparations. Amylose eluate (+), the amylose eluate in lane 4 was
treated at 37°C for 120 min. to induce splicing. MonoQ, MonoQ purified
sample. After chromatography on MonoQ, recovery of MBP-CIVPS3 (MI)
was variable and generally low. Symbols are as follows: MIP*, 180 kDa
apparent molecular mass branched molecule; MIP, 132 kDa precursor;
single splice junction cleavage products (MI, MBP-CIVPS3; IP,
CIVPS3-paramyosin ΔSal; M, MBP); and spliced products (MP,
MBP-paramyosin ΔSal and I, CIVPS3 = PI-PspI). Part B: Immunoblots.
The MonoQ sample from Part A was heat treated as above and
electrophoresed in triplicate. MIP-related proteins were identified by
immune reactivity with anti-MBP sera, anti-paramyosin sera and
anti-PI-PspI (anti-CIVPS3) sera.

Figure 14 illustrates the replaceable splice junction cassettes in
MIP21 fusion. pMIP21 contains two unique restriction sites flanking each
splice junction. Splice junctions are indicated by arrows. Amino acid
residues around the splice junctions are shown. Splice junctions can be
changed by replacing either the amino terminal XhoI-KpnI cassette or
the carboxyl terminal BamHI-StuI cassette with another DNA cassette.

Figure 15A and 15B are gels showing thermal inducible cleavage
at a single splice junction from modified MIP fusions. Fusion proteins
were purified using amylose resin columns.

Figure 15A shows cleavage at the C-terminal splice junction from
MIP23 fusion. Purified fusion protein samples were incubated at 4°C,
37°C, 50°C or 65°C for 1 hour. Products were analyzed by a 4-20%
SDS-PAGE followed by Coomassie blue staining. Cleavage of the
C-terminal splice junction of the MIP23 fusion protein (MIP) yielded
MBP-CIVPS (MI) and paramyosin ΔSal (P).

Figure 15B shows cleavage at the N-terminal splice junction from
MIP28 fusion. Purified protein samples were incubated at 4°C, 42°C,
50°C or 65°C for 1 hour. Products were analyzed by a 4-20%
SDS-PAGE followed by Coomassie blue staining. Cleavage of the
N-terminal splice junction of the MIP28 fusion protein (MIP) yielded MBP
(M) and CIVPS-paramyosin ΔSal (IP). Size standards (in kilodaltons)
are shown on the left side.



Figure 16 is a gel showing thermal inducible cleavage of MIC
fusion. Purified fusion protein samples were incubated at 4°C, 37°C,
50°C or 65°C for 1 hour. Products were analyzed by a 4-20%
SDS-PAGE followed by Coomassie blue staining. Incubation of MIC
fusion protein (MIC) yielded formation of ligated product, MBP-CBD (MC),
and excised product, Deep-Vent IVPS1 (I = I-PspI). Also, cleavage
products, MBP-Deep-Vent IVPS1 (MI) and Deep-Vent IVPS1-CBD (IC),
are present in all samples and do not change with this heat treatment.

Figure 17A and 17B show the Western blot of a trans-splicing
reaction with MI' and I'P. I'P and MI' were treated as described in the
text to induce trans-splicing as observed by the accumulation of MP and
I* products. Western blots with either anti-CIVPS3 (anti-PI-PspI) sera or
anti-paramyosin sera were performed as described in the text. Lanes
marked '4°' contain control I'P and MI' samples incubated at 4°C. Lanes
3-7 contain cleavage reaction samples after incubation for 0, 5, 10, 20,
and 30 minutes at 42°C, respectively. Lane S contains size markers
(NEB broad range prestained protein markers).

Figure 18 shows that trans-splicing re-establishes I-PspI
endonuclease activity. XmnI linearized pAKR7 DNA was digested
with 0.01, 0.1 or 1 µg of either MI', I'P, the trans-splicing reaction
products (indicated by a plus in both the I'P and MI' rows) or
cis-spliced MIP52. I-PspI activity was only present in MIP52 and the
trans-spliced mixture. Lane S contains size markers (a mixture of
lambda DNA digested with HindIII and φX174 DNA digested
with HaeIII).

Figure 19 shows the trans-cleavage of I'P by MI'22. I'P and MI'22
were treated as described in the text to induce trans-cleavage. Lanes 1
and 2 contain the starting samples MI'22 and I'P, respectively. Lane 3
contains size markers (NEB broad range protein markers). Lanes 4-9
contain cleavage reaction samples from 0, 5, 10, 20, 40 and 90 minutes
at 42°C, respectively.



Figure 20 illustrates the chemical activation of cleavage at the
N-terminal splice junction from the MI94 fusion containing the Ser1Cys
substitution. Purified protein samples were incubated at 37°C with (+) or
without (-) 0.25 M hydroxylamine. Products were analyzed by a 4-12%
SDS-PAGE followed by Coomassie blue staining. Cleavage at the
N-terminal splice junction of the MI94 fusion protein (MI) yielded MBP (M)
and CIVPS3 (I). Size standards (in kilodaltons) are shown on the left
side.

Figure 21 illustrates pMYB129 fusion construct carrying N454A
substitution.

Figure 22A and 22B illustrate one-step purification of the target
protein (MBP) by chitin. Cleavage is induced by 30 mM DTT at pH 7.6 at
4°C for 16 hours. Size markers (NEB) (on the left); lane 1: cell lysate;
lane 2: flow-through lysate; lane 3: DTT-induced cleavage product, MBP
(M); lane 4: 6M guanidine wash.



Figure 23A and 23B show activation of cleavage of MYB fusion
protein by β-mercaptoethanol (β-ME) (Figure 23A) and DTT (Figure
23B).

Figure 24 illustrates the reaction vessel dimensions and set up 
used in the preparation of chitin beads. 



DETAILED DESCRIPTION OF THE INVENTION

The present invention is directed to modified proteins and methods 
of their production. The modified proteins comprise a controllable 
intervening protein sequence (CIVPS) and a target protein, the CIVPS 
being capable of excision by protein splicing, or cleavage in the 
absence of splicing, under predetermined conditions, e.g., increase in 
temperature, changes in pH conditions, unblocking of amino acid 
residues by photolysis, dephosphorylation, deglycosylation, treatment 
with chemical reagents or other means. If desired, the modified protein 
can be subjected to these conditions. The CIVPS may also be inserted 
into a region that substantially inactivates target protein activity. 

Intervening protein sequences (IVPS) are internal in-frame peptide 
segments found within a precursor protein which are removed or 
excised via protein splicing to form the native protein. IVPSs have been 
described in the TFP1 allele from Saccharomyces cerevisiae (Hirata et
al., supra; Kane et al., supra) and recA gene from Mycobacterium
tuberculosis (Davis et al., supra (1991); Davis et al., supra (1992)). The
disclosures of these references are herein incorporated by reference.

CIVPSs of the present invention include any intervening protein 
sequence in which excision or cleavage can be controlled, either by 
inherent properties of the native IVPS, such as an increase in 
temperature, or by modifications made to an IVPS that allow the reaction 
to be controlled. 

The Vent DNA polymerase gene from the hyperthermophilic 
archaea Thermococcus litoralis contains two in-frame IVPSs, IVPS1 
(CIVPS1) and IVPS2 (CIVPS2), (Perler, et al. supra) that can be deleted 
at the DNA level without affecting the kinetic and biochemical properties 
of the expressed polymerase. Correct processing of the Vent DNA 
polymerase gene containing both IVPSs occurs in the native archaea, T. 
litoralis. In addition, correct processing of expression constructs lacking 
IVPS1 has been observed in eubacterial E. coli (Perler, et al., supra), in

WO 97/01642
PCT/US96/10545

eukaryotic baculovirus-infected insect cell and in vitro
transcription/translation systems (Hodges, et al., Nucleic Acids 
Research, 20:6153 (1992)). Furthermore, rabbit reticulocyte and E. coli 
in vitro transcription/translation systems correctly remove IVPS2 
sequences to produce the mature polymerase. While not wishing to be 
bound by theory, it is believed that the Vent and Deep Vent IVPSs are 
self splicing. 

The nucleotide sequence for the Vent DNA polymerase gene is set 
out in the Sequence Listing as SEQ ID NO:1. The nucleotide
sequence for CIVPS1 is from nucleotide 1773 to 3386. The nucleotide 
sequence for CIVPS2 is from nucleotide 3534 to 4703. CIVPS1 and 
CIVPS2 can be obtained from phage NEB 619, which was deposited 
with the American Type Culture Collection (ATCC) on April 24, 1990 and 
received ATCC accession number 40795. 

A third IVPS (CIVPS3 or DV IVPS1), has been found by the 
present inventors in the DNA polymerase gene of the thermophilic 
archaebacteria, Pyrococcus species (isolate GB-D). The Pyrococcus 
DNA polymerase is sometimes referred to as Deep Vent DNA 
polymerase. The nucleotide sequence of the Deep Vent DNA 
polymerase is set out in the Sequence Listing as SEQ ID NO: 2. The 
nucleotide sequence for CIVPS3 is from 1839 to 3449. CIVPS3 can be 
obtained from plasmid pNEB #720 which was deposited with the ATCC 
on October 1, 1991 and received ATCC accession number 68723.
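
By way of illustration only, the nucleotide coordinates recited above can be checked for consistency with an in-frame insertion. The following minimal sketch assumes the coordinates are 1-based and inclusive (an assumption; the text does not state the convention) and computes the length of each CIVPS coding region in nucleotides and codons. The names CIVPS_COORDS, span_length and codon_count are hypothetical and introduced only for this example.

```python
# Nucleotide coordinates as recited in the text.
# Assumption: coordinates are 1-based and inclusive.
CIVPS_COORDS = {
    "CIVPS1": (1773, 3386),  # within SEQ ID NO:1 (Vent DNA polymerase gene)
    "CIVPS2": (3534, 4703),  # within SEQ ID NO:1
    "CIVPS3": (1839, 3449),  # within SEQ ID NO:2 (Deep Vent DNA polymerase gene)
}


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive coordinate span."""
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of whole codons in the span; raises if the span is not in frame."""
    length = span_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"span of {length} nt is not a multiple of three")
    return length // 3


if __name__ == "__main__":
    for name, (start, end) in CIVPS_COORDS.items():
        print(name, span_length(start, end), "nt,", codon_count(start, end), "codons")
```

Under the stated assumption, each of the three spans is a whole number of codons (538, 390 and 537 codons, respectively), as would be expected for in-frame intervening protein sequences.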

In accordance with the present invention, it has been found that the 
above CIVPS1, CIVPS2 and CIVPS3 are capable of excision from 
modified proteins upon an increase in temperature. For example, the 
CIVPSs are excised at reduced rates at temperatures of 37°C and
below, but undergo excision more efficiently at temperatures from about
42°C to 80°C. Preferred excision temperatures are between about 42°C
and 60°C. Most preferably, predetermined excision conditions are
experimentally determined taking into consideration temperatures at
which the target protein will not denature or undergo thermal
inactivation. The modified proteins can be subjected to the
predetermined temperatures for a period of time ranging from less than 
one minute to several hours. In certain situations, depending on the 
thermal sensitivity of the target protein, it may be desirable to increase 
the incubation time period while decreasing the temperature. 
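
The temperature ranges recited above can be summarized, purely as a sketch, by a small lookup function. The function name excision_regime and the returned labels are hypothetical; the boundaries are those stated in the text and are treated here as exact although the text qualifies them with "about". Temperatures for which the text gives no guidance (between 37°C and 42°C, and above 80°C) are reported as unspecified rather than guessed.

```python
def excision_regime(temp_c: float) -> str:
    """Classify an incubation temperature against the ranges recited in the text.

    Assumption: the 'about' boundaries (37, 42, 60, 80 deg C) are treated as exact.
    """
    if temp_c <= 37:
        return "reduced"      # excision occurs at reduced rates
    if 42 <= temp_c <= 60:
        return "preferred"    # preferred excision temperatures
    if 60 < temp_c <= 80:
        return "efficient"    # efficient excision, outside the preferred range
    return "unspecified"      # the text does not characterize this temperature


if __name__ == "__main__":
    for t in (12, 37, 40, 42, 50, 65, 90):
        print(t, excision_regime(t))
```

Such a classification does not replace the experimental determination described above, since the thermal sensitivity of each target protein must still be taken into account.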

Additionally, different modified proteins may exhibit differences in 
splicing efficiency at various temperatures. If necessary, the optimum 
temperatures for isolation and splicing of each modified protein can be 
experimentally determined. If the CIVPS splices at too low a 
temperature for a proposed purpose, the CIVPS can be modified, or its 
position in the target protein changed such that the optimum splicing 
temperature is increased. If the optimum splicing temperature for a
particular modified protein is about 37°C, then, in order to ensure that
the modified protein does not splice in vivo, and thus to increase the
yield of intact modified proteins, host cells can be grown and the
modified protein purified at lower temperatures, e.g., 12°C-30°C. This
can also be accomplished by mutating the splicing element to shift the
splicing temperature optimum from, for example, 30°C-37°C to
42°C-50°C, thus resulting in a reduced level of splicing at
physiological temperature.

Other IVPSs can be isolated, for example, by identifying genes in
which the coding capacity is significantly larger than the observed
protein and that encode a protein sequence not present in the mature
protein. A protein containing an IVPS can be distinguished from a
protein having a "pre-pro" precursor in that the mature protein will still
have the N-terminal and C-terminal sequences of the IVPS-containing
precursor. Additionally, IVPSs can be detected by the absence of motifs
that are conserved in certain protein families, e.g., DNA polymerases.
The absence of such a motif may indicate that an IVPS is interrupting
that motif (Perler et al., supra). Suspected IVPSs can be screened by
inserting the suspected protein sequence into a marker protein, e.g.,
β-galactosidase, such that the insertion decreases marker protein activity.
The resulting modified protein can then be evaluated at periodic
intervals for an increase in marker protein activity. See Examples 1-3.
Once identified, the DNA encoding the IVPS can be isolated and
manipulated using standard DNA manipulation techniques.
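
The first criterion above, a coding capacity significantly larger than the observed protein, can be sketched as a simple calculation. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the text, and an arbitrary cut-off of 100 excess residues; both values, as well as the function names and the example figures, are hypothetical and serve only to illustrate the comparison.

```python
AVG_RESIDUE_MASS_DA = 110    # assumption: rule-of-thumb average residue mass
MIN_EXCESS_RESIDUES = 100    # assumption: arbitrary illustrative cut-off


def predicted_mass_kda(orf_nt: int) -> float:
    """Mass predicted from an open reading frame of orf_nt nucleotides (stop codon excluded)."""
    residues = orf_nt // 3
    return residues * AVG_RESIDUE_MASS_DA / 1000


def excess_residues(orf_nt: int, observed_kda: float) -> int:
    """Approximate number of encoded residues not accounted for by the observed protein."""
    excess_da = (predicted_mass_kda(orf_nt) - observed_kda) * 1000
    return round(excess_da / AVG_RESIDUE_MASS_DA)


def is_ivps_candidate(orf_nt: int, observed_kda: float) -> bool:
    """True if the gene encodes substantially more protein than is observed."""
    return excess_residues(orf_nt, observed_kda) >= MIN_EXCESS_RESIDUES


if __name__ == "__main__":
    # Hypothetical figures: a 3000 nt reading frame and a 70 kDa mature protein.
    print(predicted_mass_kda(3000), excess_residues(3000, 70.0), is_ivps_candidate(3000, 70.0))
```

A positive result only flags a gene for further study; as noted above, a "pre-pro" precursor must still be excluded by confirming that the mature protein retains the N-terminal and C-terminal sequences of the precursor.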

Chemical activation of splicing or cleavage may be accomplished
by reacting the CIVPS of interest with a chemical reagent which
enhances or induces splicing or cleavage. In one preferred
embodiment, splicing or cleavage is controlled by employing one or
more chemical reagents in a two-step process which first inactivates
cleavage or splicing by mutation of the CIVPS or any other means, and
then activates cleavage or splicing by addition of a chemical reagent,
such as hydroxylamine, β-mercaptoethanol or dithiothreitol, for example.
Control of cleavage or splicing by chemical reagents can be applied to
both cis and trans CIVPS reactions.



The chemical reagent employed depends, in part, on whether
cleavage is occurring at the N-terminus or the C-terminus. While not
wishing to be bound by theory, N-terminal cleavage is believed to
involve an ester or thioester formation between the N-terminal domain
and the IVPS. Accordingly, any chemical reagent which facilitates
cleavage of the ester or thioester, such as hydroxylamine (Bruice and
Benkovic, Bioorganic Mechanisms, W.A. Benjamin, Inc., New York,
(1966), the disclosure of which is hereby incorporated by reference
herein), β-mercaptoethanol or dithiothreitol, may be used to induce
N-terminal cleavage. C-terminal cleavage is believed to involve cyclization
of the IVPS C-terminal conserved asparagine. Accordingly, any reagent
which increases the rate of cyclization of asparagine could be used to
facilitate C-terminal cleavage. In a process referred to as noncovalent
chemical rescue, an enzyme can be mutated, resulting in an inactive
form of the enzyme. The activity can then be restored by adding a
chemical reagent to the reaction mixture. See, for example, Toney and
Kirsch (Science, 243:1485-1488 (1989), the disclosure of which is
hereby incorporated by reference herein). This process of noncovalent
chemical rescue of cleavage activity in CIVPS3 is described in Example
14. Noncovalent chemical rescue of enzyme activity by a chemical
reagent can be potentially applied to CIVPS cleavage or splicing
mutants at the primary mutation or after introduction of a second
mutation at many different possible amino acid residues in the CIVPS
using the appropriate chemical reagents for each type of mutation
(Toney and Kirsch, supra).

While not wishing to be bound by theory, C-terminal cleavage is 
believed to involve cyclization of the IVPS C-terminal conserved 
asparagine. Accordingly, any reagent which increases the rate of 
cyclization of asparagine could be used to facilitate C-terminal cleavage. 

IVPSs may also be identified by a larger open reading frame than 
observed in the mature protein and the presence of a region which has 
some of the following properties: (1) similarity to HO endonuclease or 
other homing endonucleases, (2) the amino acid sequence (Ala/Val) His
Asn (Ser/Cys/Thr) (SEQ ID NO:45).
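
The second property can be screened for computationally. The following sketch scans a protein sequence, given in one-letter code, for the four-residue pattern in which Ala or Val is followed by His and Asn and then by a hydroxyl- or sulfhydryl-bearing residue. Treating the final position as Ser, Cys or Thr is an assumption of this sketch, and the function name and the test sequences are hypothetical. A match identifies only a candidate downstream splice junction, located between the Asn and the residue that follows it.

```python
import re

# (Ala/Val) His Asn followed by Ser, Cys or Thr (assumed set for the final position).
JUNCTION_MOTIF = re.compile(r"[AV]HN[SCT]")


def candidate_downstream_junctions(protein: str) -> list[int]:
    """Return 0-based indices of the first residue after each candidate junction.

    The candidate junction lies between the Asn of the motif and the
    residue at the returned index.
    """
    return [match.start() + 3 for match in JUNCTION_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical sequences used only to exercise the function.
    print(candidate_downstream_junctions("MKAHNSGG"))
    print(candidate_downstream_junctions("GGVHNCAA"))
    print(candidate_downstream_junctions("MKAHNGGG"))
```

Because a four-residue pattern will occur by chance in many proteins, such a scan is best combined with the other criteria described above, for example an open reading frame larger than the mature protein.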

CIVPSs of the present invention also include IVPSs which have 
been modified such that the splicing reaction can be controlled. As 
shown in Figure 1 (SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ
ID NO:33, SEQ ID NO:34, SEQ ID NO:35, SEQ ID NO:36, SEQ ID NO:37,
SEQ ID NO:38, and SEQ ID NO:39), the aligned splice junctions of
known protein splicing IVPSs reveal several similarities. In particular, 
-OH and -SH side chains are found on residues at the C-terminal side of 
both splice junctions, preceded by the dipeptide His-Asn at the 
downstream splice junction. 

While not wishing to be bound by theory, it is believed that
hydroxyl/sulfhydryl groups participate in the splicing reaction and thus
modification of these residues modulates the splicing reaction. Such
modifications can be evaluated by inserting the modified CIVPS into a
marker protein, e.g., β-galactosidase, such that the insertion decreases
marker protein activity. The resulting modified protein can then be
evaluated at periodic intervals and under controlled conditions for an
increase in marker protein activity. See Examples 1-3. In addition,
Western blot analysis can be used to evaluate splicing and cleavage
products. See Example 8. Once identified, the DNA encoding the
CIVPS can be isolated and manipulated using standard DNA
manipulation techniques.

In accordance with the present invention, it has been found that
single amino acid changes at the serine 1082 of CIVPS2 slowed or
blocked the protein splicing reaction. Specifically, the threonine
substitution mutant displayed 10% of the polymerase activity of the
wild-type enzyme, while the cysteine and alanine substitution mutants gave
no detectable activity. However, a reaction product corresponding to
cleavage at the altered splice junction was observed. This species
accumulated in a mutant which replaced the serine at the splice junction
with cysteine, but was unaltered when serine was replaced with either
threonine or alanine. Wild-type CIVPS2 showed accumulation of a
species of the size expected for cleavage at the carboxy terminal splice
junction during the splicing reaction, although accumulation of this
product decreased, but was still observed, when serine 1082 was
changed to threonine, cysteine, or alanine. The S1082A variant showed
no evidence of protein splicing, but still produced this product.

Mutagenesis at the carboxy-terminal splice junction, namely amino
acid substitution of the threonine 1472 (T1472) residue with serine,
produced patterns of splicing identical to the wild-type. Replacement of
T1472 with alanine, glycine, or isoleucine gave no detectable splicing.
When asparagine 1471 was replaced with alanine, no splicing was
observed, but evidence of cleavage at the amino splice junction was
observed. Table 1, set forth below, summarizes the effects of amino acid
substitutions on splicing and cleavage in CIVPS2.

Accordingly, cleavage at the CIVPS splice junctions can be
accomplished in the absence of protein splicing, thus allowing for
controlled separation of the CIVPS from the target protein. In certain
situations, such activity is desired. In these situations, the CIVPSs of the
present invention may also encompass autoproteolytic proteins, such as
autoproteolytic proteases, for example, retroviral proteases such as the

TABLE 1

                                  N-terminal              C-terminal
                                  cleavage                cleavage
                                  ↓                       ↓
WT aa residue                     S           N           |  T
residue number                    1082        1471        |  1472

splicing observed                 T                       |  s
up/downstream junction cleavage   C                       |  c
upstream junction cleavage                    q, d, a     |
downstream junction cleavage      a                       |
no cleavage or splicing                                   |  I, a, G, stop

The effect of single amino acid substitutions on protein splicing was evaluated
using pulse-chase analysis of Vent DNA polymerase containing IVPS2 in an E. coli
expression system (Hodges, et al., supra (1992)). Arrows indicate the locations of the
splice junctions. Small case letters indicate the effects are seen only after overnight
incubation, as opposed to being seen within 2 hours for other samples. Where splicing
is observed, cleavage products from both C- and N-terminal cleavage are also found.

HIV-1 protease (Louis, et al., Eur. J. Biochem., 199:361 (1991) and
Debouck, et al., Proc. Natl. Acad. Sci. USA, 84:8903-8906 (1987)). The
skilled artisan is familiar with other such proteins. See, Krausslich, et al.,
Ann. Rev. Biochem., 701-754 (1988). Such proteins can be modified, in
accordance with the disclosed methodology, such that the proteolytic
activity is inducible under predetermined conditions.

Modification of the CIVPS amino acids, including splice junction
amino acids, can be accomplished in a number of ways. For example,
the sequence surrounding the amino acid residue to be modified may be
altered to create a biological phosphorylation site allowing it to be a
substrate for specific kinases and phosphatases. Examples of protein
kinases include casein kinase II, cAMP-dependent protein
kinase, cdc2, and pp60c-src (Pearson and Kemp, Methods in
Enzymology 200:62 (1991)). Examples of phosphatases include
protein phosphatase 2A, lambda phosphatase, and the yop
phosphatase from Yersinia (Tonks, Current Opinion in Cell Biology,
2:1114 (1990)).

Using CIVPS2 as an example, as set forth in Example 6C, an 
arginine residue was placed at position 1079 to create a consensus 
Calmodulin-dependent protein kinase II site (XRXXS*; Pearson et al.,
supra). The protein splicing reaction may then be regulated by the
degree of phosphorylation, using a kinase to create phosphoserine and 
block the splicing, and phosphatases to remove the phosphate, restoring 
the wild type serine and, consequently, protein splicing. 
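The consensus quoted above (XRXXS*, with the arginine three residues upstream of the phosphoacceptor serine) can be illustrated with a short search routine. The following Python sketch is illustrative only and is not part of the disclosed methods; the function names and the test sequence are hypothetical, and a match to the motif does not by itself establish that a kinase will act at that site.

```python
import re

# Consensus from the text: X-R-X-X-S*, where S* is the phosphoacceptor serine.
# A lookahead is used so that overlapping matches are all reported.
RXXS_MOTIF = re.compile(r"(?=(.R..S))")


def find_rxxs_serines(protein: str) -> list[int]:
    """Return 1-based positions of serines that sit in an X-R-X-X-S context."""
    # m.start() is the 0-based index of the leading X; the serine lies 4 residues on.
    return [m.start() + 5 for m in RXXS_MOTIF.finditer(protein.upper())]


def arginine_position_for(serine_position: int) -> int:
    """Position at which an arginine is needed to complete XRXXS* for a given serine."""
    return serine_position - 3


if __name__ == "__main__":
    # Hypothetical peptide, used only to exercise the functions.
    print(find_rxxs_serines("AARAASAA"))
    # For serine 1082 of CIVPS2 this gives position 1079, as stated in the text.
    print(arginine_position_for(1082))
```

Applied to serine 1082, the helper returns 1079, which agrees with the position at which the arginine was placed in Example 6C.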

Additionally, critical splice junction residues can be modified 
chemically such that the splicing reaction is blocked until the 
modification is reversed. This can be accomplished by using, for 
example, unnatural amino acid mutagenesis (Noren, et al., Science
244:182 (1989); Ellman, et al., Methods in Enzymology 202:301 (1991)).
Using this method, one of the amino acids involved in the splicing
reaction can be replaced, during translation, by a synthetic derivative in
which the side chain functionality is "masked" by a
chemically or photolytically removable group. For example, as set forth
in Example 7, serine 1082 of CIVPS2 was modified by this method as 
follows: An amber stop codon was introduced into the Vent polymerase 
gene at the position corresponding to serine 1082 (see Example 6D). 
This gene was then added to an in vitro transcription/translation system
(Ellman, et al., supra) that had previously been demonstrated to support
protein splicing of the wild-type gene. In the absence of a tRNA to read
through this codon, only truncated product was expected. When an
amber suppressor tRNA that had been chemically aminoacylated with
O-(o-nitrobenzyl) serine was added to the system, translation was able to
continue past this codon, resulting in site-specific incorporation of the
modified serine. As expected, only full-length precursor was observed, 
indicating that the splicing reaction was blocked (Figure 9). The o- 
nitrobenzyl group is removable by brief irradiation at 350 nm (Pillai, 
Synthesis 1 (1990)), so the blocked precursor would be expected to 
splice normally following irradiation. When the blocked precursor was 
exposed to visible light to free the serine and then incubated to allow the
splicing reaction to occur, spliced product was clearly seen (Figure 10). 

This strategy could also be applied to threonine 1472, which is
found at the downstream splice junction of CIVPS2, as well as any other
residue in which either the chemical functionality of the side chain is
required for splicing, or introduction of a bulky group at that position
would interfere with splicing sterically. Blocking groups can be chosen
not only on the basis of the chemistry of the side chain to be protected,
but also on the desired method of deblocking (chemically or
photolytically). For example, the cysteine groups present in other
examples of protein splicing (Figure 1) have thiol side chains that could
be blocked using, for example, disulfide exchange (e.g., with
dithiodipyridine) or complexation with transition metal ions (e.g., Hg2+).
See, Corey and Schultz, J. Biological Chemistry 264:3666 (1989). The
resulting blocked precursors could then be activated for splicing by mild
reduction or addition of metal chelators, respectively.

It has been shown that IVPS1 and IVPS2 each encodes an
endonuclease, I-Tli-II and I-Tli-I, respectively. In addition, DV IVPS1 also
encodes an endonuclease, I-PspI, which is inserted at the same position
in the DV DNA polymerase gene as IVPS1 is in the Vent DNA
polymerase gene and is 62% identical to the Vent IVPS1 gene. It has
been found that the IVPS open reading frames in Tfp1, M. tuberculosis
recA, Vent and Deep Vent DNA polymerase have protein sequence
similarity to homing endonucleases, a class of intron-encoded proteins
capable of cleaving alleles which lack the intron (Hirata et al., supra;
Kane et al., supra; Davis et al., supra; Perler et al., supra).

Certain host cells may not be able to tolerate the gene product of
the CIVPS and thus, in some embodiments it may be preferable to
inactivate the endonuclease function. In accordance with the present
invention it has been shown that protein splicing can occur when the
CIVPS endonuclease function has been inactivated. Such inactivation
can be accomplished in a variety of ways, including for example, random
mutagenesis, deletion or insertional inactivation, or site directed
mutagenesis. Preferably, the endonuclease function is inactivated by
site directed mutagenesis. I-Tli-I shares sequence similarity with other
"homing endonucleases" in the pair of characteristic dodecapeptide
motifs (Cummings et al., Curr. Genet. 16:381 (1989)). As shown in
Example 6B, endonuclease activity was inactivated by oligonucleotide- 
directed mutagenesis of a single residue (aspartate 1236 to alanine) 
within one of these motifs. Substitution of alternative residues could 
also reduce or abolish endonuclease activity without affecting protein 
splicing. Inactivation of endonuclease function has been shown to 
increase the stability of constructs carrying the modified proteins. 

Target proteins which can be used in accordance with the present 
invention include, for example, enzymes, toxins, cytokines, glycoproteins 
and growth factors. Many such proteins are well known to the skilled 
artisan. The amino acid and nucleotide sequence of such proteins are 
easily available through many computer data bases, for example, 
GenBank, EMBL and Swiss-Prot. Alternatively, the nucleotide or amino 
acid sequence of a target protein can be determined using routine 
procedures in the art. 

If it is desirable to substantially inactivate target protein activity, the 
CIVPS is inserted into a region(s) that will inactivate such activity. Such 
regions are well known to the skilled artisan and include, for example, 
binding sites, enzyme active sites, the conserved motifs of proteins, e.g., 
DNA polymerases, and dimerization or multimerization sites. 
Alternatively, the CIVPS may be inserted randomly and the activity of 
each modified protein measured until the desired level of activity is 
obtained. Preferably, such a modified protein has about a 50% reduced 
level of activity compared to the native protein. More preferably about 
75%. Still more preferably greater than 99%. 

The CIVPS may be inserted into the target gene by any number of 
means. Preferably, to assure proper protein splicing if the CIVPS is 
excised, it is important to insert the CIVPS immediately before a proper 
splice junction residue because excision of the CIVPS leaves that amino
acid at the splice junction. This can be accomplished by either inserting
the CIVPS immediately before the appropriate splice junction amino
acid or by modifying the CIVPS such that it "brings" the appropriate
amino acid with it.

For example, CIVPS1, 2 or 3 can be inserted immediately before
the appropriate splice junction amino acids, for example, serine,
threonine or cysteine residues, most preferably before serine or
threonine. See, Figure 1. Such sites are readily available in most target
proteins.
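Locating such sites in a target protein amounts to listing the positions of its serine, threonine and cysteine residues. The following Python sketch shows one way this might be done; the function name and the example sequence are hypothetical, and the listing says nothing about whether insertion at a given site is tolerated by the target protein.

```python
def candidate_insertion_sites(protein: str, allowed: str = "STC") -> list[tuple[int, str]]:
    """List residues that a CIVPS could be placed immediately before.

    Returns (1-based position, residue) pairs for every residue in `allowed`.
    The default set is serine, threonine and cysteine, as named in the text.
    """
    return [
        (index + 1, residue)
        for index, residue in enumerate(protein.upper())
        if residue in allowed
    ]


if __name__ == "__main__":
    # Hypothetical target sequence, used only for illustration.
    target = "MKSATC"
    print(candidate_insertion_sites(target))
    # Restricting the search to the most preferred residues, serine and threonine.
    print(candidate_insertion_sites(target, allowed="ST"))
```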

In certain situations, such as when the target protein is a toxin, it
may be desirable to further control protein splicing by adding a
secondary control. This may be accomplished by inserting the CIVPS
before a less optimal amino acid, for example, one that the CIVPS does
not normally precede and thus may slow down the splicing reaction.

As set forth above, insertion can be at any site within the target
protein if the CIVPS "brings" the appropriate downstream amino acid
with it. This can be accomplished by creation of CIVPS DNA having a
codon for the desired downstream amino acid. Methods for producing
such DNA are set out in detail below. This DNA can then be inserted at
any site within the target DNA. Upon protein splicing of the resulting
modified protein, the extra residue brought by the CIVPS will be left
behind. Thus, if activity of the final product is important, the skilled
artisan must take steps to assure that the extra residue will not be left in
an area of the target protein that will adversely affect activity.

The CIVPS may be inserted into the target protein, or fused to the
target protein, by chemically synthesizing the primary amino acid
sequence of the target protein, including the CIVPS, inserted at any
desired site, using standard methods (e.g., see Hunkapiller, et al.,
Nature 310:105 (1984)) and a commercially available protein
synthesizer.

Alternatively, a DNA sequence encoding a CIVPS is inserted in, or
fused to, a DNA sequence encoding for a target protein such that both 
coding sequences form a continuous reading frame. This can be 
accomplished using a variety of methods known to the skilled artisan, 
several of which are set out below. 

For example, the CIVPS DNA is inserted into any restriction 
enzyme site that makes a blunt cut in the target gene and which is in 
frame. This can be accomplished by first synthesizing a CIVPS DNA
fragment with a threonine codon (for Vent IVPS2) or a serine codon (for
Deep Vent IVPS1 or Vent IVPS1) at its 3' end. This fragment is then
ligated in-frame to a linear plasmid cut to blunt ends by the restriction 
endonuclease. Using the lacZ DNA sequence, for example, an EcoRV 
site can be used to insert Vent IVPS2 or Deep Vent IVPS1 between 
residue 375 (aspartic acid) and 376 (isoleucine). See, Figure 2. 
However, as discussed above, using this method, if the CIVPS is 
excised an extra residue is expected to remain at the splice junction and 
therefore depending on where the CIVPS is inserted, the resulting 
protein may not have the same function or structure as the native protein. 
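Whether a blunt-cutting site falls exactly between two codons can be checked from the coding sequence alone. The following Python sketch illustrates the check for EcoRV, which recognizes GATATC and cuts between GAT and ATC to leave blunt ends. The function name and the short test sequences are hypothetical and are not taken from lacZ.

```python
ECORV_SITE = "GATATC"  # EcoRV cuts GAT^ATC, leaving blunt ends
ECORV_CUT_OFFSET = 3   # nucleotides from the start of the site to the cut


def in_frame_blunt_sites(orf: str, site: str = ECORV_SITE,
                         cut_offset: int = ECORV_CUT_OFFSET) -> list[tuple[int, int]]:
    """Find blunt cut sites that fall exactly between two codons of an ORF.

    `orf` is assumed to begin at the first nucleotide of codon 1.
    Returns (upstream codon number, downstream codon number) pairs, 1-based.
    """
    orf = orf.upper()
    hits = []
    start = orf.find(site)
    while start != -1:
        nucleotides_before_cut = start + cut_offset
        if nucleotides_before_cut % 3 == 0:  # the cut lies on a codon boundary
            upstream_codon = nucleotides_before_cut // 3
            hits.append((upstream_codon, upstream_codon + 1))
        start = orf.find(site, start + 1)
    return hits


if __name__ == "__main__":
    # Hypothetical ORF: ATG GAT ATC TAA, with the cut between codons 2 and 3.
    print(in_frame_blunt_sites("ATGGATATCTAA"))
    # Same site shifted by one nucleotide, so the cut is no longer on a codon boundary.
    print(in_frame_blunt_sites("ATGAGATATCTAA"))
```

When the site is in frame in this way, the codons flanking the cut are GAT (aspartic acid) and ATC (isoleucine), which matches the residues named for the lacZ EcoRV site.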

The CIVPS DNA could also be inserted by making silent mutations 
(preserving the amino acid residue) near one end or both ends of the 
CIVPS to create restriction sites compatible with the target gene. Using 
CIVPS2 as an example, a BspEI restriction site can be made near the 5'
end and a SpeI restriction site near its 3' end, by silent mutations. Using
PCR primers overlapping the new restriction sites and continuing
through the beginning of the lacZ target gene at either asp 594 or thr
595, one can generate a lacZ fragment with compatible BspEI and SpeI
restriction sites. Then, the CIVPS is inserted between an aspartic acid
codon (residue 594) and a threonine codon (residue 595) within the lacZ
coding region. DNA fragment(s) can be synthesized from both the
CIVPS and the target gene by PCR with their ends at the insertion site
overlapping with the termini of the CIVPS, therefore including the same
restriction sites. After appropriate restriction endonuclease treatment,
DNA fragments with compatible ends can then be ligated to create a
fusion gene. Since no extra residue would be left after excision of the
CIVPS, native polypeptide will form when splicing occurs. Preferably,
the restriction site being created is unique within the CIVPS and within
the target gene to avoid ligation of multiple fragments and thus,
complicated screening procedures.

If the plasmid vector carrying the target gene sequence is relatively
small, for example, less than about 5 Kb, a linear form of the plasmid can
be generated using PCR, and then the linear plasmid can be ligated to
the CIVPS gene. Using this method the CIVPS gene can be inserted at
any location in the target gene as follows: First, plasmid DNA containing
the target gene can be synthesized by PCR using a pair of primers
starting at the insertion site, for example, serine or threonine codons for
CIVPS1, 2 and 3, or any codon if the CIVPS also brings the appropriate
downstream amino acid. Next, the CIVPS gene (with or without serine
or threonine) can be ligated to the linear plasmid DNA (without the
serine or threonine codon). The required splice junction amino acids
(serine or threonine) can be positioned on either the CIVPS fragment or
on the target gene. The advantage of having the required amino acid on
the CIVPS fragment when placing upstream of an endogenous serine or
threonine is that the self-ligated vector DNA (without the CIVPS insert)
may only express a deficient product of the target gene because of the
deletion of the serine or threonine in the coding region. This may aid in
phenotype selection for the fusion construct if the fusion protein can
splice to produce a functional product.

The fusion DNA encoding the modified protein can be inserted into
an appropriate expression vector, i.e., a vector which contains the
necessary elements for the transcription and translation of the inserted
protein-coding sequence. A variety of host-vector systems may be
utilized to express the protein-coding sequence. These include
mammalian cell systems infected with virus (e.g., vaccinia virus,
adenovirus, etc.); insect cell systems infected with virus (e.g.,
baculovirus); microorganisms such as yeast containing yeast vectors, or
bacteria transformed with bacteriophage DNA, plasmid DNA or cosmid
DNA. Depending on the host-vector system utilized, any one of a 
number of suitable transcription and translation elements may be used. 
For instance, when expressing a modified eukaryotic protein, it may be 
advantageous to use appropriate eukaryotic vectors and host cells. 
Expression of the fusion DNA results in the production of the modified 
proteins of the present invention. 

Once obtained, the modified proteins can be separated and 
purified by appropriate combination of known techniques. These 
methods include, for example, methods utilizing solubility such as salt 
precipitation and solvent precipitation, methods utilizing the difference in 
molecular weight such as dialysis, ultrafiltration, gel filtration, and
SDS-polyacrylamide gel electrophoresis, methods utilizing a difference in
electrical charge such as ion-exchange column chromatography, 
methods utilizing specific affinity such as affinity chromatography, 
methods utilizing a difference in hydrophobicity such as reverse-phase 
high performance liquid chromatography and methods utilizing a 
difference in isoelectric point, such as isoelectric focusing 
electrophoresis. 

If desired, the modified proteins can be subjected to predetermined 
conditions under which the CIVPS is excised. Such conditions depend
on the CIVPS used. For example, CIVPS1, 2 and 3 are capable of
excision by subjecting the modified protein to increased temperature,
42°C - 80°C, most preferably, 42°C - 60°C. This can be accomplished
using any known means, for example a water bath or a heat generating 
laser. The time period for incubation can range from less than one 
minute to greater than several hours. As discussed above, in certain 
situations, depending on the thermal sensitivity of the target protein, it 
may be desirable to increase the incubation period while decreasing the 
temperature. In addition, if in vivo splicing is desired, temperatures 
compatible with the growth of the host organism are preferred. 

The present invention may be used to produce proteins that are
highly toxic to the host cells by using the CIVPS to modify a toxic
target protein such that the modified protein is non-toxic. This can be
accomplished, for example, by inserting the CIVPS into a region(s)
responsible for toxicity. After isolation, the non-toxic modified protein
can then be subjected to predetermined conditions under which the CIVPS
will excise and the resulting toxin can be isolated.

If a protein is extremely toxic to a host cell it may be desirable to
produce that protein using a method referred to as "trans-splicing". Using
this method the toxic protein is produced in two or more pieces in
separate host cells, each piece being modified by insertion of a CIVPS.
For example, a first modified protein can be produced comprising an
amino portion of a target protein to which is inserted at its carboxy
terminus an amino terminal fragment of a CIVPS, and thereafter a second
modified protein can be produced comprising the remaining portion of
the target protein into which is inserted at its amino terminus the
remaining fragment of the CIVPS. Alternatively, overlapping CIVPS
fragments can be used.
Each modified protein is then isolated from the host cells and incubated 
together under appropriate conditions for splicing of the CIVPS. This 
results in a ligated target protein. By dividing the target protein in two 
different hosts, there is no possibility that even a minute fraction will 
splice in vivo, adversely affecting the host. In addition, the entire CIVPS 
may be inserted on either side of the splice junction of the first modified 
protein and the remaining target protein fragment added to the splicing 
mixture. 

Accordingly, trans-splicing may allow expression of highly toxic
genes in E. coli by expressing only an inactive portion of the target
protein in each of two different hosts. The two complementary fragments
can then be purified in large amounts and ligated together by in vitro
trans-splicing. By dividing the CIVPS into 2 parts, its splicing activity is
effectively controlled until the two parts are brought together. Therefore,
any IVPS becomes a CIVPS when divided into 2 parts which are
purified from different hosts and kept separate until splicing is required.

Trans-cleavage combines the properties of trans-splicing, CIVPS
cleavage and the three part affinity-cleavage vector systems. In
trans-cleavage, the CIVPS is separated into 2 fragments, which, when
combined and activated, result in cleavage between the protein of
interest and the CIVPS. In one envisioned application similar to that
described for cis-cleavage in Example 9, trans-cleavage can be used for
affinity purification of a protein of interest. In this application, one or both
fragments of the CIVPS has an affinity tag for purification and a cloning
site to make an in-frame fusion with the protein of interest. Each of the
two constructs is grown and induced separately as described for
trans-splicing. Protein from each of the two constructs is then purified either by
standard chromatographic or affinity techniques. The two protein
fragments are then combined under conditions which allow the two parts
of the CIVPS to come together to form an active CIVPS. Cleavage is
then induced by temperature, pH, chemical reagents or other means,
releasing the purified protein of interest.

In one embodiment, the combination of the two parts of the CIVPS 
can occur while one part is bound to a solid matrix; in this case, after 
activation of cleavage, the protein of interest is released from the solid 
matrix while the CIVPS and any affinity tag remain on the solid matrix. 
Under some conditions, the two CIVPS fragments will remain associated 
after the cleavage reaction, allowing both to remain bound to the solid 
support even though only one fragment has an affinity tag. One might 
also have affinity tags on both fragments of the CIVPS to allow 
separation of the protein of interest from the CIVPS fragments after 
cleavage. As in the case of the 3 part fusion described in Example 9, the 
order of the binding domain, the CIVPS and the protein of interest can 
be varied. All variations described for CIVPS purification and cleavage
schemes can be applied to trans-cleavage systems also.

By using the same information obtained for cis-cleavage on
CIVPS fusions or by using new mutations, cleavage may be
programmed to occur at either the N-terminal or the C-terminal of the
CIVPS. In the present example, the starting CIVPS
fragments are those described in Example 12. These CIVPS3
fragments were converted to trans-cleavage reagents by cassette
replacement as described in Example 10. Some of the many possible
mutations which we have shown result in trans-cleavage at the
C-terminal of CIVPS3 are Ala535 of CIVPS3 to Lys, Ser1 of CIVPS3 to Ala
and Ile2 of CIVPS3 to Lys. Asn537 of CIVPS3 to Ala resulted in
trans-cleavage at the N-terminal of CIVPS3.

The IVPSs of the present invention may be used in a "protein 
ligation" to add unnatural amino acid residues, structural probes, 
identifying epitopes or tags, or other determinants to a target protein. For 
example, the target protein can be fused to the amino terminus of the 
IVPS. A stop codon can be placed immediately following the carboxy 
terminus of the IVPS. The peptide to be fused can then be added to the 
mixture. If necessary, in order to more closely mimic the native splicing 
mechanism, the amino terminus of this peptide may be serine, threonine, 
or cysteine. The splicing reaction may then proceed, pushed by mass 
action towards splicing of the product. 

The above reaction could also be adapted to occur with a starting 
material composed of the IVPS fused at the carboxy terminus to the 
amino terminus of the target protein. Initiation at a methionine
engineered to precede the serine residue which begins certain CIVPSs
would allow translation to occur; the methionine would likely be
processed off in E. coli, leaving an amino terminal serine residue. The
peptide to be fused
to the amino terminus of this target protein could then be added, and 
splicing allowed to proceed. Such an approach may be favored since 
there is no known requirement for the carboxy terminal residue on the 
peptide being added. Additionally, current experimental evidence 
suggests that cleavage of the upstream splice junction precedes the 
ligation reaction, indicating this approach more closely approximates the 
native reaction mechanism. Targeting peptides could also be added to 
the peptide to facilitate translocation of the fusion protein. 

The present invention can also be used to study the effect of a
target protein during a specific part of the cell cycle or under specific 
conditions such as induction of another protein or differentiation. For 
example, the chromosomal copy of a gene encoding a particular protein 
can be replaced with a version containing a CIVPS. At a specific point in 
the cell cycle, differentiation or other desired point, the cells are heated 
causing the precursor to splice, and thus the active target protein is 
present only at this point. 

The CIVPSs of the present invention can also be used to isolate
modified proteins by use of affinity chromatography with antibodies 
specific to the CIVPS. For example, monoclonal or polyclonal 
antibodies can be generated having binding affinity to a CIVPS using 
standard techniques. These antibodies can then be utilized in affinity 
chromatography purification procedures to isolate a modified protein. 
After purification, if desired, the modified proteins can be subjected to 
predetermined conditions under which the CIVPS will undergo excision. 

As discussed above, cleavage at the CIVPS splice junction can be 
accomplished in the absence of protein splicing, thus allowing for 
controlled separation of the CIVPS from the target protein. Such 
CIVPSs can therefore be used in a fusion protein purification system. 

Fusion protein purification systems are well known to the skilled 
artisan. See, EPO 0 286 239 and N.M. Sassenfeld, TIBTECH, 8:88-93 
(1990). Typically, in such systems, a binding protein and a target protein 
are joined by a linker having a protease recognition site. The fusion is 
then purified by affinity chromatography on a substrate having affinity for 
the binding protein. The binding protein and the target protein are then 
separated by contact with a protease, e.g., factor Xa. In these systems, in 
order to obtain a highly purified target protein, the protease must be 
separated from the target protein, thus adding an additional purification 
step, as well as the potential for contamination. The method of the
present invention, by using a CIVPS, instead of a protease, avoids these
and other problems encountered in currently used protein fusion
purification systems.

In the method of the present invention, a modified protein
comprising a fusion protein in which a CIVPS is between the target
protein and a protein having affinity for a substrate (binding protein) is
formed. Techniques for forming such fusion proteins are well known to
the skilled artisan. See, EPO 0 286 239 and J. Sambrook, et al.,
Molecular Cloning: A Laboratory Manual (1989), Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, p. 17.29-17.33.

Binding proteins which may be employed in the method of the
present invention include, for example, sugar binding proteins, such as
maltose or arabinose binding protein, receptor binding proteins, amino
acid binding proteins and metal binding proteins. Other binding
proteins are well known to the skilled artisan. See, EPO 0 286 239 and
N.M. Sassenfeld, TIBTECH, supra.

The modified protein is then contacted with a substrate to which
the binding protein has specific affinity, e.g., using affinity
chromatography.

The highly purified target protein can be liberated from the column
by subjecting the CIVPS to predetermined conditions under which
cleavage is initiated, for example, between the CIVPS and the target
protein. Alternatively, the purified fusion protein can be eluted from the
column and liberated as above.

The present invention is further illustrated by the following
examples. These examples are provided to aid in the understanding of
the invention and are not to be construed as a limitation thereof.



All references cited above and below are herein incorporated by 
reference. 



EXAMPLE 1

SYNTHESIS OF IVPS CASSETTES FOR INSERTION 
INTO BLUNT SITES BETWEEN TARGET GENE CODONS 

DNA fragments or cassettes for in-frame insertion of IVPSs into the
lacZ coding region or any other target gene can be prepared by
polymerase chain reaction (PCR) with or without the first downstream 
external protein sequence (EPS) codon. The native downstream 
residues are serine for Deep Vent IVPS1 and Vent IVPS1 or threonine 
for Vent IVPS2. It has been found that IVPS2 can splice if it precedes a 
threonine or cysteine, although at reduced levels. Although not wishing 
to be bound by theory, it is believed that all the IVPSs can splice to some 
extent when preceding either serine, threonine or cysteine. Cassettes 
including the downstream serine or threonine can be inserted at any 
desired location in the target gene including preceding a serine or 
threonine. In the latter constructions, one may delete the serine or 
threonine from the target gene and substitute it with the incoming 
residue on the cassette. Cassettes lacking downstream serines, 
threonines or cysteines may be inserted prior to a serine, threonine or 
cysteine in the target gene. 

The following protocol describes the production of cassettes for
Deep Vent IVPS1 (CIVPS3) and Vent IVPS2 (CIVPS2) (endo+ and endo-
versions), including the first downstream EPS codon.

The PCR mixture contains Vent DNA polymerase buffer (NEB),
supplemented with 2 mM magnesium sulfate, 400 µM of each dNTP, 0.9 µM
of each primer, 40 ng plasmid DNA and 2 units of Vent DNA
polymerase in 100 µl. Amplification was carried out using a
Perkin-Elmer/Cetus thermal cycler at 94°C for 30 sec, 48°C for 30 sec and 72°C
for 2 min for 30 cycles. Deep Vent IVPS1 was synthesized from pNEB
#720 (ATCC No. 68723) which has a 4.8 Kb BamHI fragment
containing the Pyrococcus sp. DNA polymerase gene inserted into the
BamHI site of pUC19. Vent IVPS2 was synthesized from pV153-2
which has a 1.9 kb EcoRI fragment (2851-4766) of the Vent DNA
polymerase gene sequence in the vector Bluescribe SK- (Stratagene).
Alternatively, pNEB671 (ATCC No. 68447) can also be used for IVPS2. 
pAMQ29 is an endonuclease-deficient derivative of pV153-2, carrying 
an amino acid substitution (aspartic acid 1236 to alanine) within the Vent 
IVPS2 coding region. Primers 5'-AGTGTCTCCGGAGAAAGTGAGAT-3'
(SEQ ID NO:3) (Vent IVPS2 forward, 3534-3556, a substitution of
A3542 to C) and 5'-AGTATTGTGTACCAGGATGTTG-3' (SEQ ID NO:4)
(Vent IVPS2/Thr reverse, 4685-4706) were used to synthesize endo+ or
endo- Vent IVPS2 fragment (1173 bp) with a threonine codon at its 3'
terminus. Primers 5'-AGCATTTTACCGGAAGAATGGGTT-3' (SEQ ID
NO:5) (DV IVPS1 forward, 1839-1862) and
5'-GCTATTATGTGCATAGAGGAATCCA-3' (SEQ ID NO:6) (DV IVPS1/Ser reverse, 3428-3452) were
used to synthesize the Pyrococcus sp. (or Deep Vent) IVPS1 fragment
(1614 bp) with a serine codon at its 3' end. Reverse primers lacking the
final three nucleotides could be used to generate IVPS fragments
lacking the C-terminal serine or threonine.
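The primer coordinates and fragment sizes given above can be cross-checked arithmetically: each primer should be as long as its coordinate span, each fragment should run from the first forward coordinate to the last reverse coordinate, and the first three nucleotides of each reverse primer should be the reverse complement of the threonine or serine codon it adds. The following Python sketch performs these checks. The data layout and function names are hypothetical; the sequences and coordinates are those stated in the text.

```python
# (sequence, first coordinate, last coordinate) as stated in the text.
PRIMERS = {
    "Vent IVPS2 forward": ("AGTGTCTCCGGAGAAAGTGAGAT", 3534, 3556),
    "Vent IVPS2/Thr reverse": ("AGTATTGTGTACCAGGATGTTG", 4685, 4706),
    "DV IVPS1 forward": ("AGCATTTTACCGGAAGAATGGGTT", 1839, 1862),
    "DV IVPS1/Ser reverse": ("GCTATTATGTGCATAGAGGAATCCA", 3428, 3452),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Only the codons relevant to this check are listed.
CODONS = {"ACT": "Thr", "AGC": "Ser"}


def span(first: int, last: int) -> int:
    """Number of nucleotides covered by an inclusive coordinate range."""
    return last - first + 1


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def terminal_codon(reverse_primer: str) -> str:
    """Sense-strand codon contributed by the 5' end of a reverse primer."""
    return reverse_complement(reverse_primer[:3])


if __name__ == "__main__":
    for name, (seq, first, last) in PRIMERS.items():
        print(name, len(seq), span(first, last))
    print("Vent IVPS2 fragment:", span(3534, 4706), "bp")
    print("DV IVPS1 fragment:", span(1839, 3452), "bp")
    print(CODONS[terminal_codon(PRIMERS["Vent IVPS2/Thr reverse"][0])])
    print(CODONS[terminal_codon(PRIMERS["DV IVPS1/Ser reverse"][0])])
```

The computed fragment lengths agree with the 1173 bp and 1614 bp sizes stated for the Vent IVPS2 and Deep Vent IVPS1 fragments.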

The PCR samples were extracted with phenol and chloroform,
precipitated in 0.3 M NaAc and 70% ethanol at -20°C overnight,
recovered by spinning at 10 K for 10 min in a microfuge, dried and each
resuspended in 30 µl of distilled water, loaded on a 1% agarose gel for
electrophoresis at 60 volts for 15 hours. The gel slices that contain the
PCR-amplified fragments were placed in a 1% low melting agarose gel
for electrophoresis at 80 volts for 2 hours. DNA fragments were
recovered from the low melting agarose gel by incubation in 0.5 ml of TE
buffer (10 mM Tris-HCl/0.1 mM EDTA, pH8.0) at 65°C for 30 min,
extractions with phenol, phenol-chloroform (1:1 mixture) and chloroform,
precipitation in 0.6 M NaAc (pH5.2) and 50% isopropanol at -20°C
overnight. DNA was spun down, washed with 70% ethanol, dried and
resuspended in 15.5 µl distilled water.

Phosphorylation of the IVPS DNA fragments was performed at
37°C for 60 min with 2 µl of 10 x polynucleotide kinase buffer (NEB),
15.5 µl of purified DNA, 2 µl 10 mM ATP, and 5 units of T4
Polynucleotide kinase (NEB) in 20 µl. The samples were heated in a
65°C water bath for 10 min. After addition of 80 µl of TE buffer (10 mM
Tris-HCl/0.1 mM EDTA, pH8.0), the samples were sequentially extracted
with phenol, phenol-chloroform (1:1 mixture) and chloroform. DNA was
precipitated in 2.5 M NH4Ac and 70% ethanol at -70°C for 3.5 hours,
pelleted by spinning at 10 K for 10 min in a microfuge, washed with cold
70% ethanol, dried and resuspended in distilled water (20 µl for Vent
IVPS2 or Deep Vent IVPS1 DNA, 10 µl for Vent IVPS endo- DNA).

EXAMPLE 2 

IN-FRAME INSERTION OF IVPS IN A RESTRICTION ENZYME
LINEARIZED PLASMID, SUCH AS ONE ENCODING
β-GALACTOSIDASE

In this example, we describe how the IVPS cassettes can be 
cloned into a target gene by inserting the cassette at a restriction 
enzyme site which makes a blunt cut in the target gene between 2 
codons. The cassette can carry a C-terminal serine, cysteine or 
threonine if necessary. This protocol works best if the restriction enzyme 
cuts the target gene vector once or twice. As an example, we describe 
insertion into the EcoRV site of the lacZ gene (Figure 2). 

PREPARATION OF EcoRV-LINEARIZED pAH05

pAH05 carries the entire lacZ gene sequence on a 3.1 kb BamHI-DraI
fragment from pRS415 (Simons, et al., Gene 53:85-96 (1987))
inserted between the BamHI and SmaI sites in the polylinker of pAGR3
downstream of a tac promoter. The tac promoter is a transcription control
element which can be repressed by the product of the lacIq gene and
induced by isopropyl β-D-thiogalactoside (IPTG). The 5.9 Kb vector
pAH05 (NEB) also has a transcription terminator sequence upstream of
the tac promoter and the polylinker, and the E. coli lacIq gene. pAH05
contains two EcoRV recognition sequences. EcoRV leaves blunt ends at
its cleavage site. One of the EcoRV cleavage sites lies within the lacZ
coding region between the 375th codon (aspartic acid) and the 376th
codon (isoleucine) and is planned as the site for in-frame insertion of the
IVPS fragments. The other site is located 3.2 Kb downstream within the
E. coli lacIq gene. The plasmid is cut partially to produce some
molecules in which only one of the EcoRV sites has been cleaved.
These linear plasmids are purified. The IVPS cassettes will be randomly
cloned into either EcoRV site. Therefore, the resultant recombinants
must be screened for orientation and insertion into the proper EcoRV
site. DNA was partially digested by incubation of 15 µg of pAH05 DNA
with 40 units of EcoRV (NEB) in 100 µl of 1 X NEB buffer 2 at 37°C for 60
min. The sample was heated to 65°C for 10 min to inactivate EcoRV, and
20 µl of agarose gel loading dye was then added. DNA
fragments were separated by electrophoresis on a 1% low melting
agarose gel. Linearized pAH05 plasmid DNA was recovered from the
low melting agarose gel as described in Example 1 and resuspended in
44.6 µl of distilled water.

Dephosphorylation of EcoRV-linearized pAH05 was carried out in
50 µl of 1 X NEB buffer 2 at 50°C for 60 min. in the presence of 2 µg DNA
and 4 units of Calf Intestinal Alkaline Phosphatase (NEB). The sample
was heated in a 65°C water bath for 30 min after addition of 0.5 µl of 0.5
M EDTA (pH8.0) and extracted with phenol, phenol-chloroform (1:1
mixture), and chloroform. DNA was precipitated in 0.75 M NH4Ac and
70% ethanol for 2 hours, recovered as described in Example 1, and
resuspended in 20 µl of distilled water.

CONSTRUCTION OF IVPS-lacZ FUSION GENES

Ligation of dephosphorylated pAH05 DNA with phosphorylated
IVPS fragments was carried out at 16°C for 15 hours in 20 µl volume
with addition of 8.6 µl distilled water, 2 µl of 10 X T4 DNA ligase buffer
(NEB), 4 µl of 0.1 µg/µl dephosphorylated pAH05 DNA, 5 µl IVPS DNA
prepared as described above (0.25 µg of Vent IVPS2, 0.4 µg Deep Vent
IVPS1 or 0.25 µg of Vent IVPS2 endo-) and 160 units of T4 DNA ligase
(NEB).
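
As a quick arithmetic check on this reaction, the component volumes can be totalled in a few lines of Python. This is a sketch under one explicit assumption: the ligase stock is taken to be the 400,000 units/ml preparation mentioned in Example 4, which is not stated for this reaction. The variable names are hypothetical.

```python
# Assumption: ligase stock of 400,000 units/ml (as in Example 4) = 400 units/µl.
LIGASE_STOCK_UNITS_PER_UL = 400.0

components_ul = {
    "distilled water": 8.6,
    "10 X T4 DNA ligase buffer": 2.0,
    "dephosphorylated pAH05 DNA (0.1 µg/µl)": 4.0,
    "IVPS DNA": 5.0,
    "T4 DNA ligase (160 units)": 160 / LIGASE_STOCK_UNITS_PER_UL,
}

total_ul = sum(components_ul.values())
vector_ug = 4.0 * 0.1  # µl of vector times µg/µl

print(f"total volume: {total_ul:.1f} µl; vector DNA: {vector_ug:.1f} µg")
```

Under that assumption the components account for the full stated reaction volume, with the ligase contributing the remaining 0.4 µl.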

E. coli strain RR1 was transformed by mixing 100 µl of competent
RR1 cells with 10 µl of ligation sample on ice for 30 min., heating at 42°C
for 2 min., chilling on ice for 5 min., adding 0.8 ml LB media (10
grams/liter tryptone, 5 grams/liter yeast extract, 10 grams/liter NaCl, 1
gram/liter Dextrose, 1 gram/liter MgCl2·6H2O, pH7.2 at 25°C) and
incubating at 30°C for 45 min. The samples were plated onto LB plates
supplemented with 100 µg/ml ampicillin. After incubation overnight at
30°C, about 150-300 colonies per plate were observed.

Colony hybridization was utilized to screen for clones that carry
recombinant plasmids. The Vent IVPS2 forward primer and the Deep
Vent IVPS1 forward primer, described in Example 1, were radio-labeled
with [γ-32P]ATP using T4 polynucleotide kinase and used as
hybridization probes. Colonies were lifted onto nitrocellulose and treated
for 5 min. in each of the following solutions: 10% SDS, 0.5 M NaOH/1.5
M NaCl, 0.5 M Tris-HCl (pH7.5)/0.5 M NaCl (twice) and 2 X SSC (twice).
The nitrocellulose filters were dried at room temperature for 1 hour,
baked in vacuum at 80°C for 2 hours, soaked in 6 X SSC for 5 min and
washed in a solution of 50 mM Tris-Cl (pH8.0), 1 M NaCl, 1 mM EDTA
and 0.1% SDS at 42°C for 2 hours. After treatment at 42°C for 4 hours in
6 X NET, 5 X Denhardt's, 0.5% SDS and 25 µg/ml of denatured salmon
sperm DNA, the filters were incubated with the radiolabeled oligomer
probe under the same conditions for 16 hours and then washed in 6 X
SSC at room temperature three times for 15 min, twice at 42°C for 2 min
and twice at 50°C for 2 min, followed by autoradiography. Thirty-six
clones were found to hybridize to the corresponding oligomer probes.

The positive clones were further analyzed to determine insert
location by PCR amplification of plasmid DNA extracted from these
clones, using the Vent IVPS2 forward primer (or the Deep Vent IVPS1
forward primer) described in Example 1, and a lacZ reverse primer
(5'-AGGGTCGACAGATTTGATCCAGCG-3' (SEQ ID NO:7)) complementary
to the lacZ coding sequence (1417-1440, with a G:T mismatch at 1437)
392 nt downstream of the insertion site. PCR reactions from 14 clones
produced the corresponding DNA fragments. Clones pVT133, 138, 139,
141, 142, and 144 contain the 1.4 Kb Vent IVPS2 insert, and pVTE834,
836, 839 and 841 contain the Vent IVPS2 (endo-) insert, all yielding
DNA fragments of approximately 1.1 kb. Clones pDVS712, 742, 745
and 746 carry the 1.6 Kb Deep Vent IVPS1 insert, producing DNA
fragments of about 2.0 Kb.

EXPRESSION OF THE IVPS-lacZ FUSION GENES

The clones were further examined for their ability to express fusion
(modified) proteins upon induction with IPTG.

The clones were cultured in LB medium supplemented with 100
µg/ml ampicillin at 30°C until OD600nm reached 0.5. To prepare lysate
from uninduced cells, 1.5 ml of culture was pelleted and resuspended in
100 µl of urea lysis buffer, followed by boiling for 10 min. After addition of
IPTG to a final concentration of 0.3 mM, the cultures were grown at 30°C
for 4 additional hours. Cells from 1.5 ml culture were pelleted and then
lysed with 250 µl of the urea lysis buffer after induction for 2 hours and 4
hours. Protein products were analyzed by Coomassie Blue stained gels.
Three of the Vent IVPS2-lacZ fusion constructs (pVT139, 142 and 144)
and all four Vent IVPS2 (endo-)-lacZ fusion constructs showed a major
product of about 162-165 KDa, the expected size for a Vent
IVPS2-β-galactosidase fusion protein. All four Deep Vent IVPS1-lacZ
fusion clones expressed a larger product of 173-178 KDa, the expected
size for the Deep Vent IVPS1-β-galactosidase fusion protein.

The identity of the Vent IVPS2 fusion proteins from pVT142 and
144, and pVTE836 and 839 was further analyzed by western blots using
antibody raised against I-Tli-I (NEB) or β-galactosidase (Promega).
Samples were electrophoresed on 4-20% SDS gels (ISS, Daichi,
Tokyo, Japan) with prestained markers (BRL), transferred to
nitrocellulose, probed with antisera (from mouse), and detected using
alkaline phosphatase-linked anti-mouse secondary antibody as described
by the manufacturer (Promega). A band of approximately 160 KDa from
all four clones examined reacts with both sera and migrates at the
same location as the Coomassie Blue stained band. Deep Vent IVPS1
fusions were also examined. Western blot analysis of pDVS712 and 742
using sera against β-galactosidase and I-PspI (the protein product of
Deep Vent IVPS1) yielded the predicted major band at about 168-175
KDa, identical to the Coomassie Blue stained band.



EXAMPLE 3

THERMAL CONTROL OF PROTEIN SPLICING
IN β-GALACTOSIDASE-IVPS FUSIONS

The constructs described above (IVPSs inserted into the lacZ
EcoRV site) yield fusion (modified) proteins after induction. The IVPS
protein can be excised from the fusion protein to generate a ligated
target protein (active β-galactosidase) and free IVPS endonuclease by
incubation at elevated temperatures.

SPLICING IS CONTROLLABLE BY TEMPERATURE
INDUCTION: β-GALACTOSIDASE ACTIVITY IN CRUDE
EXTRACTS INCREASES IN RESPONSE TO TEMPERATURE
SHIFT

Crude extracts were prepared from cultures of RR1 (the E. coli
host) and RR1 containing pAH05 (the non-fusion β-galactosidase parent
plasmid described in Example 2) or the fusion constructs, pVT142 (Vent
IVPS2 or CIVPS2), pVTE836 (Vent IVPS2 endo-) or pDVS712 (Deep
Vent IVPS1 or CIVPS3) by the following steps. A single colony was
inoculated in 10 ml LB medium supplemented with 100 µg/ml ampicillin,
incubated at 30°C overnight, subcultured in 1 liter LB medium (100
µg/ml ampicillin) at 30°C to OD600nm about 0.5 and induced with IPTG
at 0.3 mM at 30°C for 2 hours. Cells were spun down and resuspended
in 100 ml of LB, sonicated for 3 min at 4°C and spun at 7000 rpm for 15
min. The supernatants were recovered and stored at -20°C.

7.5 ml aliquots of crude extracts were incubated in 42°C or 50°C
water baths; 1 ml aliquots were taken at 1, 2 and 12 hours for pVT142
and pVTE836 extracts or 0.5, 1, 2, 4 and 16 hours for pDVS712, pAH05
or RR1 extract.

β-galactosidase activity was measured according to Miller et al.
(Experiments in Molecular Genetics (1972), Cold Spring Harbor, New
York, Cold Spring Harbor Laboratory). Assay buffer was prepared by
mixing Z buffer with 2.7 µl/ml of 2-mercaptoethanol. Substrate
o-nitrophenyl-β-D-galactopyranoside (ONPG) was dissolved in the assay
buffer at 4 mg/ml. 0.1 ml of treated or untreated extract was transferred
into a test tube containing 0.9 ml of assay buffer and 1 drop of 0.1% SDS
and incubated for 5 min at 28°C. 0.1 ml LB medium was used for the blank.
0.2 ml of 4 mg/ml ONPG was added to start an assay reaction. When
adequate yellow color developed, the reaction was stopped by addition
of 0.5 ml of 1 M Na2CO3. The incubation time was recorded and activity
was measured on a spectrophotometer at OD420nm and OD550nm.
The enzymatic activity from the heat-treated extract was calculated as
follows. The activity after incubation was divided by the activity of the
zero time point; the ratio was then multiplied by 100 to yield a
percentage. Comparison of enzymatic activity indicated that while heat
treatment had no effect on activity from RR1 or RR1/pAH05 extract in the
first two hours of incubation, all three IVPS-lacZ fusion constructs,
pVT142, pVTE836 and pDVS712, exhibited an increase in enzymatic
activity, to between 143% and 221% of that of untreated samples, in
response to the temperature shift to 42°C (Figure 3A and 3B). This
increase in β-galactosidase activity was due to excision of the IVPS and
ligation of the two halves of β-galactosidase, forming more active enzyme.
The splicing was confirmed by Western blot analysis. β-galactosidase
activity in RR1 cells comes from expression of the chromosomal gene.
The overnight incubation resulted in lower enzymatic activity from all
samples, probably due to thermal inactivation of β-galactosidase (Figure
3A and 3B).
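
The percentage calculation described above can be sketched in Python as follows. This is an illustration, not the procedure actually used: the 1.75 x OD550 light-scattering correction is taken from Miller's standard formula rather than from this text, the function names are hypothetical, and the absorbance readings are invented for the example.

```python
def onpg_rate(od420: float, od550: float, minutes: float, extract_ml: float) -> float:
    """Scatter-corrected o-nitrophenol absorbance per minute per ml of extract."""
    corrected = od420 - 1.75 * od550  # correction term from Miller's standard formula
    return corrected / (minutes * extract_ml)


def percent_of_zero_time(rate_after: float, rate_zero: float) -> float:
    """Activity after heat treatment as a percentage of the zero time point."""
    return 100.0 * rate_after / rate_zero


# Hypothetical readings for 0.1 ml of extract assayed for 10 min.
zero_time = onpg_rate(od420=0.40, od550=0.02, minutes=10, extract_ml=0.1)
heated = onpg_rate(od420=0.60, od550=0.02, minutes=10, extract_ml=0.1)
print(f"{percent_of_zero_time(heated, zero_time):.0f}% of untreated activity")
```

A value above 100% corresponds to the increase reported for the fusion constructs; a value below 100% corresponds to the loss seen after overnight incubation.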

SPLICING IS CONTROLLABLE BY TEMPERATURE
INDUCTION: ANALYSIS OF PROTEINS BY COOMASSIE
BLUE STAINING AND WESTERN BLOTS

Analysis of IVPS-lacZ fusion protein synthesis in RR1 cells is
complicated by chromosomal expression of β-galactosidase. Therefore,
for ease of analysis, all the constructs were transferred to an E. coli host
which did not synthesize β-galactosidase.

Preparation of crude cell extracts from the IVPS-lacZ fusion clones
and western blot analysis of heat-treated samples were performed as
follows.

The fusion constructs and the lacZ expression vector pAH05 were
introduced into a lacZ-deletion E. coli strain ER2267 (NEB) by the
standard transformation procedure as previously described.

The cultures of ER2267 (50 ml), ER2267/pAH05 (50 ml), pVT142
or pDVS712 plasmid (each in 1 liter) were grown at 30°C in LB media,
supplemented with ampicillin at 100 µg/ml for plasmid-containing cells.
When OD600nm reached between 0.48 and 0.55, inducer IPTG was
added into the cultures to 0.3 mM final concentration and the cultures
were incubated at 23°C for 3 additional hours. Cells were spun down,
resuspended in 50 ml (for ER2267 or pAH05-bearing ER2267) or 100
ml (for pVT142- or pDVS712-bearing ER2267) LB media, sonicated for 3
min at 4°C and spun at 7000 rpm for 10 min. The supernatants were
stored at -20°C. Three 5 ml aliquots of each extract were incubated and
sampled at 23°C, 42°C or 50°C for 16 hours. Aliquots of 0.9 ml were
transferred into 1.5 ml microfuge tubes after incubation for 1, 2, 3, 4 and 6
hours. 5 µl of untreated or treated extract was mixed with 10 µl of water
and 5 µl of 5 X sample buffer (0.31 M Tris-Cl, pH6.8/10% SDS/25%
2-mercaptoethanol/50% glycerol/0.005% Bromophenol blue) and boiled
for 10 min.

5 µl of each sample was loaded on a 4-20% SDS polyacrylamide gel
and electrophoresed at 100 volts for 3-4 hours. Western blots, using
antibody raised against β-galactosidase (Promega) and antibody raised
against endonuclease I-Tli-I or I-PspI (NEB), were carried out according
to the procedure of Promega. The results showed barely trace amounts
of endonuclease present in cells after IPTG induction at 23°C from both
pVT142 and pDVS712 constructs, indicating inefficient excision activity,
if any. However, after shifting the ER2267/pVT142 extract to higher
temperatures, 42°C or 50°C, abundant IVPS2 product (I-Tli-I, about 42
KDa), identical to the excised endonuclease from the Vent DNA
polymerase precursor, accumulated (Figure 4). A similar pattern
was observed for pDVS712/ER2267 extract treated at 42°C or 50°C
(Figure 4), resulting in accumulation of a product of about 60 KDa,
expected for the Deep Vent IVPS1 product, I-PspI.

WO 97/01642 PCT/US96/10545

Western blot analysis using antibody against β-galactosidase
indicated that excision of the IVPS domains was coupled with ligation or
rejoining of the N-domain and the C-domain of the interrupted
β-galactosidase. The heat-treated samples of both fusion constructs
contained a product of 114 KDa, identical in size to full-length
β-galactosidase (Figure 4). However, this product accumulated only
in small amounts in the samples of pVT142, indicating that splicing from
this fusion protein is inefficient under these conditions.

The fusion proteins were further tested for their ability to splice at
higher temperatures, up to 80°C. The initial reaction rates at different
temperatures were compared. The extracts were incubated in 300 µl
aliquots in 1.5 ml microfuge tubes at 42°C, 50°C, 65°C or 80°C. 20 µl
were taken from each heated extract sample at 15 and 30 min and 1, 2,
and 4 hours, mixed with 40 µl of water and 20 µl of 5 X sample buffer
and boiled for 10 min. Western blot analysis showed that Deep Vent
IVPS-β-galactosidase fusion protein was able to splice at 65°C and at
80°C, although splicing seems more efficient at 65°C as measured by
the accumulation of the 114 KDa product. Excision of the Vent IVPS2 was
efficient at 65°C but seems blocked at 80°C. Lack of accumulation may
be due to thermal denaturation and precipitation of β-galactosidase at
80°C with time.



EXAMPLE 4 

IN-FRAME INSERTION OF IVPS IN A PCR GENERATED
LINEAR PLASMID, SUCH AS ONE ENCODING β-AGARASE I

In Example 2, we described inserting the IVPS cassettes from
Example 1 into a restriction enzyme linearized plasmid. This method is
limited by the availability of appropriate restriction enzyme sites in a
target gene. PCR amplification using opposing primers on a circular
plasmid allows linearization of any plasmid at any position, limited only
by the capacity of the PCR reaction. Once the target plasmid is linear, the
process is essentially the same as described in Example 2 for restriction
enzyme generated linear plasmids.

As described in Example 2, insertion of an IVPS cassette into a
target gene can be accomplished by ligation of an IVPS fragment with
linear plasmid. In this example, PCR primers are used to generate
plasmids linearized just prior to a serine or threonine codon. Thus,
when the IVPS is excised and the two halves of the target protein are
ligated, no extra amino acid is left behind in the target protein. The
serine or threonine at the insertion site can be positioned on either the
IVPS fragment or on the target gene fragment. If the serine or threonine
is present on the IVPS cassette, then the target gene PCR primer can be
constructed with a deletion of the 3 nucleotides encoding the first
residue of the downstream EPS. If the IVPS cassette lacks the serine or
threonine codon, then PCR with opposing, abutting PCR primers is used
to synthesize target plasmid linearized at the serine or threonine codons.
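
The two primer-placement options just described can be illustrated with a short Python sketch. Everything in it is hypothetical: the function names, the `keep_codon` flag, the 6-nt primer length and the toy coding sequence are chosen only to make the geometry visible, and real primers would be longer and checked for melting temperature.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def opposing_primers(coding_seq: str, codon_number: int,
                     primer_len: int = 20, keep_codon: bool = False):
    """Return (reverse_primer, forward_primer) that linearize at a codon.

    codon_number is 1-based. The reverse primer ends immediately upstream
    of the codon. With keep_codon=False the forward primer starts after the
    codon (the Ser/Thr is supplied by the IVPS cassette); with
    keep_codon=True it starts at the codon (cassette lacks the Ser/Thr).
    """
    codon_start = 3 * (codon_number - 1)
    fw_start = codon_start if keep_codon else codon_start + 3
    if codon_start < primer_len or fw_start + primer_len > len(coding_seq):
        raise ValueError("not enough flanking sequence for primers of this length")
    reverse_primer = reverse_complement(coding_seq[codon_start - primer_len:codon_start])
    forward_primer = coding_seq[fw_start:fw_start + primer_len]
    return reverse_primer, forward_primer


# Toy coding sequence: ATG GCA AAA GGT TCT CCA GAA TTT TAA (serine = codon 5).
toy = "ATGGCAAAAGGTTCTCCAGAATTTTAA"
print(opposing_primers(toy, 5, primer_len=6))                   # codon deleted from plasmid
print(opposing_primers(toy, 5, primer_len=6, keep_codon=True))  # codon kept on plasmid
```

Either way, the ligated construct carries exactly one copy of the serine or threonine codon, so no extra residue remains after splicing.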

This example describes cloning two IVPS elements, Vent IVPS2
and Deep Vent IVPS1, into a gene encoding β-agarase I (Yaphe, W.,
Can. J. Microbiol. 3:987-993 (1957)) by the procedure described in
Example 2. The Deep Vent IVPS1 is inserted in front of a serine, the
108th codon, of the 290 amino acid β-agarase I gene, while the Vent
IVPS2 is inserted in front of a threonine, the 133rd codon of the
β-agarase I gene.

The IVPS DNA fragments, including the serine codon (for Deep
Vent IVPS1) or the threonine codon (for Vent IVPS2) at the 3' end, were
prepared as described in Example 1. pAG6a1 (NEB), a 3.8 Kb
recombinant plasmid containing the β-agarase I gene sequence in
vector pUC18 in the orientation of the lac promoter, was used as a PCR
template to synthesize linear plasmid DNA fragments. Primers
agaS108.rv (5'-GAGAACTTTGTTCGTACCTG-3' (SEQ ID NO:8)) and
agaS108.fw (5'-GGTATTATTTCTTCTAAAGCA-3' (SEQ ID NO:9)) are
complementary to DNA sequence 5' and 3' of the 108th codon,
respectively. Primers agaT133.rv (5'-GTTGTTTGTTGGTTTTACCA-3'
(SEQ ID NO:10)) and agaT133.fw (5'-ATGGCAAATGCTGTATGGAT-3'
(SEQ ID NO:11)) are complementary to sequence 5' and 3' of the 133rd
codon, respectively. Each pair of the primers was used to synthesize
linear plasmid DNA fragments, lacking the serine or threonine codon.
The PCR mixture contained Vent DNA polymerase buffer (NEB),
supplemented with 2 mM Magnesium sulfate, 400 µM of each dNTP, 0.5
µM of each primer, 20 ng plasmid DNA and 2 units of Vent DNA
polymerase in 100 µl. Amplification was carried out using a
Perkin-Elmer/Cetus thermal cycler at 94°C for 30 sec, 45°C for 30 sec
and 72°C for 5 min for 30 cycles. The PCR samples were extracted with
phenol and chloroform, precipitated in 0.3 M NaAcetate and 50% isopropanol,
recovered by spinning at 10 K rpm for 10 min in a microfuge, dried and
resuspended in 100 µl of distilled water. The DNA samples were then
electrophoresed on a 1% low melting agarose gel and PCR-synthesized
fragments were recovered as described in Example 1.

Ligation of PCR-synthesized fragment with phosphorylated IVPS
fragment (Example 1) was carried out at 16°C for 12 hours in 20 µl
volume with addition of 9.5 µl distilled water, 2 µl of 10 X T4 DNA ligase
buffer (NEB), 4 µl of 0.01 µg/µl PCR-synthesized plasmid DNA, 4 µl IVPS
DNA (0.20 µg of Vent IVPS2 or 0.32 µg Deep Vent IVPS1) and 0.5 µl of
400,000 units/ml of T4 DNA ligase (NEB). Transformation of E. coli strain
RR1 with the ligation samples was performed as described in Example
2. Transformants were cultured in LB medium, supplemented with 100
µg/ml ampicillin, for extraction of plasmid DNA using the alkaline lysis
method (Sambrook et al., Molecular Cloning: A Laboratory Manual
(1989), Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New
York). Plasmid DNAs were compared with pAG6a1 by electrophoresis
on a 0.8% agarose gel followed by staining with ethidium bromide.
Recombinant plasmid pAG108S18 contains the Deep Vent IVPS1 insert
while pAG133T22, 26, 31 and 35 all contain the Vent IVPS2 insert.

EXPRESSION OF THE IVPS-β-AGARASE I FUSION GENES

The clones were further examined for their ability to express fusion
proteins. RR1 cells carrying pAG108S18 or pAG133T35 were cultured in
1 liter of a modified LB medium, lacking dextrose, supplemented with
100 µg/ml ampicillin, at 30°C until OD600nm reached about 0.5. After
addition of inducer IPTG to a final concentration of 0.3 mM, the cultures
were cooled down and grown at 25°C for 4 additional hours. Cells were
spun down and resuspended in 50 ml LB medium. Crude extracts were
prepared as described in Example 3. Western blots using antibodies
raised against I-Tli-I (NEB), I-PspI (NEB) and β-agarase I (NEB) were
performed to detect fusion (modified) proteins expressed from these
clones. Samples were electrophoresed on 4-20% SDS gels (ISS,
Daichi, Tokyo, Japan) with prestained markers (BRL), transferred to
nitrocellulose, probed with antisera (from mouse), and detected using
alkaline phosphatase-linked anti-mouse secondary antibody as
described by the manufacturer (Promega). Both anti-I-PspI sera and
anti-β-agarase I sera reacted with a 90-95 KDa product expressed from
pAG108S18/RR1, of the expected size for a Deep Vent IVPS1
(approximately 60 KDa) - β-agarase I (approximately 30 KDa) fusion
protein (Figure 5). Both anti-I-Tli-I sera and anti-β-agarase I sera reacted
with a 70-75 KDa product from pAG133T35/RR1, approximately the size
expected for a Vent IVPS2 (42 KDa) - β-agarase I fusion protein (Figure
5).
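
The expected fusion sizes are simply the sums of the component sizes quoted above. The following Python sketch makes that comparison explicit; the dictionary layout and names are hypothetical, and the observed ranges are the approximate gel estimates reported in the text.

```python
BETA_AGARASE_KDA = 30  # approximate size of β-agarase I given in the text

# IVPS product sizes (KDa) and observed fusion bands (KDa range) from the text.
FUSIONS = {
    "Deep Vent IVPS1 (I-PspI)": {"ivps_kda": 60, "observed": (90, 95)},
    "Vent IVPS2 (I-Tli-I)": {"ivps_kda": 42, "observed": (70, 75)},
}


def expected_fusion_kda(ivps_kda: int, target_kda: int) -> int:
    """Expected size of an unspliced IVPS-target fusion protein."""
    return ivps_kda + target_kda


for name, data in FUSIONS.items():
    expected = expected_fusion_kda(data["ivps_kda"], BETA_AGARASE_KDA)
    low, high = data["observed"]
    print(f"{name}: expected ~{expected} KDa, observed {low}-{high} KDa, "
          f"within range: {low <= expected <= high}")
```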






EXAMPLE 5 

INSERTION OF IVPS INTO TARGET GENE BY CREATION OF
NEW RESTRICTION ENZYME SITES THROUGH SILENT
SUBSTITUTIONS



In the previous examples, an IVPS cassette containing the entire
IVPS sequence, with or without the first downstream EPS codon, was
inserted into a blunt, linearized plasmid. It is also possible to create a
restriction site by silent mutations (preserving the amino acid residue)
near the ends of either the IVPS or the target gene.

CREATION OF A RESTRICTION SITE NEAR THE END OF
THE IVPS

It is possible to create a restriction site by silent mutations
(preserving the amino acid residue) at both ends of an IVPS to facilitate
insertion of the IVPS at any position within the target gene. After
creation of the new restriction sites, the IVPS is cut with these enzymes.
The target gene plasmid is generated by PCR. Since the restriction sites
are within the IVPS, one must include the missing IVPS sequences on
the 5' end of the respective target gene PCR primers to complete the
IVPS and to generate compatible cloning sites in the target gene (Figure
6).

For example, silent mutations in Vent IVPS2 can create a BspEI
site at the 5' end using primer Vent IVS2 Forward BspEI
(5'-AGTGTCTCCGGAGAAAGTGAGAT-3' (SEQ ID NO:12)) and a SpeI site at
its 3' end using primer Vent IVS2 Reverse SpeI
(5'-ATTGTGTACTAGTATGTTGTTTGCAA-3' (SEQ ID NO:13)). It can then be
inserted, for example, between an aspartic acid codon (residue 594) and
a threonine codon (residue 595) within the lacZ coding region. A linear
target gene plasmid can be generated by PCR as described in Example 4
with primers which include the BspEI and SpeI sites, the remaining
portion of the IVPS and a region with identity to lacZ, using primer
lacZ1/BspEI reverse
(5'-GCCTCCGGAGACACTATCGCCAAAATCACCGCCGTAA-3' (SEQ ID NO:14))
and primer lacZ2/SpeI forward
(5'-GCCACTAGTACACAATACGCCGAACGATCGCCAGTTCT-3' (SEQ ID NO:15)).
DNA fragments are synthesized from both the IVPS and the target gene by
PCR. Both IVPS and target gene primers contain the new restriction
sites. After cutting with the appropriate restriction endonucleases, DNA
fragments with compatible ends can then be ligated to create a fusion
gene. Since no extra residue would be left after excision of the IVPS,
native β-galactosidase polypeptide would be expected to form if splicing
occurs.
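
Whether a proposed base change is silent can be checked by translating the sequence before and after the change. The Python sketch below does this for the 5' end of Vent IVPS2, comparing the unmodified sequence (as it appears in the IVPS-annealing portion of SEQ ID NO:17) with the BspEI-containing sequence of SEQ ID NO:12. It assumes that the reading frame begins at the first nucleotide shown, which is an inference from the primer coordinates in Example 1 rather than something stated outright; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq: str) -> str:
    """Translate complete codons of seq, starting at its first nucleotide."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def is_silent(original: str, mutated: str) -> bool:
    """True if both sequences have equal length and encode the same peptide."""
    return len(original) == len(mutated) and translate(original) == translate(mutated)


BSPEI_SITE = "TCCGGA"
original = "AGTGTCTCAGGAGAAAGTGAG"  # unmodified 5' end of Vent IVPS2 (assumed frame)
mutated = "AGTGTCTCCGGAGAAAGTGAG"   # same region with the A-to-C change

print(translate(original), translate(mutated), is_silent(original, mutated))
print("BspEI site created:", BSPEI_SITE in mutated and BSPEI_SITE not in original)
```

Under the assumed frame, the single A-to-C change converts one serine codon to another while creating the BspEI recognition sequence.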






Insertion of IVPS at restriction sites near the insertion site. 

In another general approach (Figure 7), a restriction site near the 
insertion site in the target gene (for example, a threonine or a serine 
codon), can be used to insert an IVPS with ends compatible to the target 
gene. Restriction site(s) can be created by silent nucleotide substitution 
at or near the insertion site or native restriction sites can be used. A 
linear target gene plasmid is made by PCR as described in Example 4, 
beginning at the restriction sites near the insertion site. The IVPS is 
synthesized with primers containing the compatible restriction sites and 
the remainder of the target gene sequence (the sequence between the 
restriction site and the insertion site). The IVPS DNA fragment, with the 
ends overlapping the sequence at the insertion site, can be synthesized, 
cut with the appropriate enzyme(s), and then ligated to the vector that is
cut by the same enzyme(s). 

For example, IVPS elements can be inserted between residue 479
(aspartic acid) and 481 (serine) within the lacZ gene by creating a SalI
site (residues 478-479) and a XbaI site (residues 481-482,
serine-arginine) by silent mutations. This can be achieved by PCR of the
target plasmid, pAH05, described in Example 2, using primers lacZ3 Sal
reverse (5'-AGGGTCGACAGATTTGATCCAGCG-3' (SEQ ID NO:7)) and
lacZ4 Xba forward (5'-CCTTCTAGACCGGTGCAGTATGAAGG-3' (SEQ
ID NO:16)). Next the IVPS2 fragment is generated by PCR using primers
Vent IVS2 Forward SalI
(5'-GCCGTCGACCCTAGTGTCTCAGGAGAAAGTGAGATC-3' (SEQ ID NO:17))
and Vent IVS2 reverse XbaI
(5'-GCCTCTAGAATTGTGTACCAGGATGTTGTTTGC-3' (SEQ ID NO:18)). DNA
fragments are synthesized from both the IVPS and the target gene by
PCR. Both IVPS and target gene primers contain the new restriction
sites. Unfortunately, this vector also contains single XbaI and SalI sites
(Figure 7). Therefore, the target gene vector PCR product must be cut
under conditions which produce partial digestion. The required linear
plasmid is then isolated from agarose gels. After cutting with the
appropriate restriction endonucleases, DNA fragments with compatible
ends can then be ligated to create a fusion gene. Since no extra residue
would be left after excision of the IVPS, native β-galactosidase
polypeptide would be expected to form if splicing occurs. Generally, it is
important to select or create a unique site within the target gene and
vector to facilitate the cloning process as described above.
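
The presence and position of the engineered sites in the four primers can be confirmed with a simple string search. The Python sketch below is illustrative: the primers are keyed by their SEQ ID numbers, the helper name is hypothetical, and only the top strand is searched, which suffices because both recognition sequences are palindromic.

```python
RECOGNITION_SITES = {"SalI": "GTCGAC", "XbaI": "TCTAGA"}

# Primer sequences as listed in the text, with the site each should carry.
PRIMERS = {
    "SEQ ID NO:7": ("AGGGTCGACAGATTTGATCCAGCG", "SalI"),
    "SEQ ID NO:16": ("CCTTCTAGACCGGTGCAGTATGAAGG", "XbaI"),
    "SEQ ID NO:17": ("GCCGTCGACCCTAGTGTCTCAGGAGAAAGTGAGATC", "SalI"),
    "SEQ ID NO:18": ("GCCTCTAGAATTGTGTACCAGGATGTTGTTTGC", "XbaI"),
}


def site_positions(seq: str, site: str) -> list:
    """0-based start positions of every occurrence of site in seq."""
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


for name, (seq, enzyme) in PRIMERS.items():
    positions = site_positions(seq, RECOGNITION_SITES[enzyme])
    print(f"{name}: {enzyme} site at {positions}")
```

Each primer carries its site once, preceded by three extra 5' nucleotides, so the vector and IVPS products present matching SalI and XbaI ends after digestion.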



EXAMPLE 6 



A. To facilitate experimentation on the splicing of IVPS2 in Vent DNA
polymerase, a modified version of the T7 promoter construct
pV174-1B1 was created. This modified version, pANG5 (Figure
8), encodes a Vent DNA polymerase precursor identical to that of
pV174-1B1. Numerous silent mutations were introduced to
simplify the generation of mutants as discussed in this application,
particularly at the upstream and downstream splice junctions.
Changes included:



1. Destroying XmaI and PpuMI sites in the vector backbone.
The XmaI site was removed first by cutting the T7 expression
vector pAII17 with XmaI, repairing the cohesive ends with the
Klenow fragment of DNA polymerase I, and then religating the
blunt termini. Plasmids were screened for resistance to cleavage
by XmaI. The PpuMI site was similarly removed from the resulting
vector, screening this time for resistance to PpuMI cleavage. The
final vector was named pAML1. This vector allowed the use of
unique XmaI and PpuMI sites within the polymerase gene.

2. Introduction of silent base changes to create restriction sites.
Changes were introduced using oligonucleotide-directed
mutagenesis as described by Kunkel (T.A. Kunkel, J.D. Roberts
and R.A. Zakour, Methods in Enzymology 154:367-382 (1987)).
Single-strand templates were created in two Bluescript SK-
phagemid derivatives by superinfection with the f1 helper phage
IR1 (Enea, et al., Virology 122:22-226 (1982)). The first contained
a BsaAI to BamHI fragment (representing nucleotides 3714-5837
of the Vent DNA polymerase sequence) from pV174-1B1 ligated
into BamHI/EcoRV cut Bluescript. The second fragment included
a ClaI to SspI fragment (nucleotides 816-4408) ligated into
ClaI/EcoRV cut Bluescript.

The BsaAI/BamHI construct was mutagenized
simultaneously with three oligonucleotides:

5'-GCAAAGAACCGGTGCGTCTCTTC-3' (SEQ ID NO:19) (AgeI nt
4669-4674)

5'-AGCAACAGAGTTACCTCTTG-3' (SEQ ID NO:20)
(amber1703ochre)

5'-CAGTTTCCAGCTCCTACAATGAGACCTACGAGC-3' (SEQ ID
NO:21) (D1236A)

where modified bases are underlined, and changes are indicated
in parentheses. The oligonucleotide to create D1236A also
included silent base changes to create a BsaI site to assist in
screening. The resulting isolate was named pAMN2.

The ClaI/SspI construct was mutagenized simultaneously
with four oligonucleotides:

5'-GTAGTGTCGACCCCATGCGG-3' (SEQ ID NO:22) (SalI nt
3463-3468)

5'-CGTTTTGCCTGATTATTATCTCACTTTC-3' (SEQ ID NO:23)
(BsaBI nt 3554-3563)

5'-GTCCACCTTCGAAAAAAGATCC-3' (SEQ ID NO:24) (BstBI nt
3608-3613)

5'-CCGCATAAAGGACCTTAAAGC-3' (SEQ ID NO:25) (PpuMI nt
3517-3523)

where markings are as above. Screening was also as above,
and the resulting construct was named pAM022.

The BsaAI/BamHI construct was also mutagenized with the
oligonucleotide:

5'-GAGGAAGAGATCATCATCATAGC-3' (SEQ ID NO:26) (BsaBI
blocking nt 5641)

and screened for resistance to BsaBI cleavage due to the addition
of a dam methylation site. The resulting construct was named
pAMW3.

Finally, the NdeI site at the initiation codon of pV174-1B1
was inactivated by partial NdeI cleavage, repairing the termini
with Klenow, and recircularizing using T4 DNA ligase. Plasmids
were screened for the loss of the appropriate NdeI site. One such
construct was named pAKC4.

The pANG5 construct was assembled from the above parts:

1. XbaI/ClaI from pAKC4 (translation initiation and amino
terminus of Vent DNA polymerase)

2. ClaI/NdeI from pAM022 (more amino terminal polymerase
plus the amino terminal region of IVPS2)

3. NdeI/NsiI from pAMN2 (carboxyl terminal region of IVPS2,
carboxyl terminal region of Vent DNA polymerase)

4. NsiI/BamHI from pAMW3 (final 5 amino acids of the
polymerase plus the downstream region)

5. BamHI/XbaI from pAML1 (T7 promoter, origin of replication,
ampicillin resistance).

Comparisons between pANG5 and the parent pV174-1B1
show identical patterns of Vent DNA polymerase and I-TliI
production, with the exception of the greater viability of the
pANG5-containing strains, as discussed below. This is as expected if
splicing occurs at the protein level, as opposed to at the RNA or
DNA level.

B. During work on the expression of the Vent DNA polymerase gene
in E. coli it was found that a large increase in expression and cell
viability occurred after deletion of IVPS1 and IVPS2. This
increase could represent either toxic effects of I-TliII and I-TliI, the
gene products of IVPS1 and IVPS2, respectively, or toxic effects
of the splicing reaction itself. It was reasoned that endonuclease
and splicing activities could well be independent, allowing
inactivation of the endonuclease without affecting splicing. A
single amino acid substitution to A, as described in the
construction of pANG5, was made in a conserved residue within
the amino-proximal dodecapeptide motif of I-TliI (changed residue
D1236). Although these constructs expressed Vent DNA
polymerase, no I-TliI activity was detected. T7 expression strains
such as BL21(DE3) tolerated pANG5 well, even at 37°C, which was
not the case for pV174-1B1. Analysis of protein splicing by western
blot and pulse-chase analysis showed no discernible differences in
protein splicing between pANG5 and pV174-1B1, namely
production of a full-length precursor and subsequent formation of
the mature polymerase and a protein corresponding in size to
I-TliI.

C. A consensus calmodulin-dependent protein kinase II site
(XRXXS*; Pearson et al., supra) was constructed, replacing
tyrosine 1079 with arginine using cassette replacement
mutagenesis. In short, pANG5 was cut at the unique sites BsaBI
and PpuMI and the duplex (SEQ ID NO:27) listed below was
inserted, introducing the desired change.

5'-GTCCTTCGTGCGGACAGTGTCTCAGGAGAAAGTGAGATAA-3'
3'-GAAGCACGCCTGTCACAGAGTCCTCTTTCACTCTATT-5'

The correct construct was verified by DNA sequencing.



D. Introduction of an amber stop codon for adding a blocked amino
acid was accomplished by cassette replacement mutagenesis in
pANG5. For example, serine 1082 was replaced by an amber
codon using the following duplex (SEQ ID NO:28) inserted into
pANG5 cut with PpuMI and BsaBI:

5'-GTCCTTTATGCGGACTAGGTCTCAGGAGAAAGTGAGATAA-3'
3'-GAAATACGCCTGATCCAGAGTCCTCTTTCACTCTATT-5'.



Similarly, tyrosine 1472 was replaced with an amber termination
codon by placing the following duplex (SEQ ID NO:29) into
pANG5 cut with AgeI and SmaI:

5'-CCGGTTCTTTGCAAACAACATCCTGGTACACAATTAGGACGGC
3'-AAGAAACGTTTGTTGTAGGACCATGTGTTAATCCTGCCG

TTTTATGCCACAATACCC-3'
AAAATACGGTGTTATGGG-5'



Finally, since the Vent DNA polymerase gene ends in an
amber codon (TAG), that termination codon will be changed to an
ochre codon (TAA) by inserting an appropriate restriction
fragment from pAMN2 (described above) into the corresponding
site in pANG5.

EXAMPLE 7

CONTROL OF PROTEIN SPLICING BY INCORPORATION OF
O-(o-NITROBENZYL) SERINE AT THE SPLICE JUNCTION OF
CIVPS2

Two vectors were constructed using pV174.1B1 to demonstrate
photoactivatable protein splicing. The first construct, pANY5 (also
referred to as "wild-type"), can be described on the amino acid level as
follows: pV174.1B1 Δ1-1063, Δ1544-1702, V1542M, V1541M,
1543opal(TGA). This construct is designed to give a 55.8 kDa precursor
protein, which splices out the 45.3 kDa endonuclease (I-TliI) and yields a
10.5 kDa ligation product, when translated in an in vitro
transcription/translation system. The second construct, pAOD1 (also
referred to as the "amber mutant"), can be described on the amino acid
level as follows: pV174.1B1 Δ1-1063, Δ1544-1702, V1542M, V1541M,
1543opal(TGA), S1082amber(TAG). This construct is designed to give a
2.2 kDa amber fragment under standard in vitro transcription/translation
conditions, but will incorporate a photoactivatable serine when the in vitro
reaction is supplemented with an amber suppressor tRNA that has been
chemically aminoacylated with o-nitrobenzylserine. With the serine at
position 1082 "blocked", the precursor is unable to splice. When
irradiated with intense 350 nm light, the o-nitrobenzyl group is released
(Pillai, supra), the nucleophilic hydroxyl side chain of serine is freed, and
the protein is able to splice.
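
The sizes quoted for the pANY5 products can be checked for internal consistency. The following is only a minimal sketch of that arithmetic; the function name and the 0.1 kDa tolerance are illustrative choices, not part of the original disclosure.

```python
# Sizes in kDa as stated in the text for the pANY5 ("wild-type") construct.
PRECURSOR_KDA = 55.8
ENDONUCLEASE_KDA = 45.3       # excised I-TliI
LIGATION_PRODUCT_KDA = 10.5   # joined external protein sequences


def splicing_mass_gap(precursor, excised, ligated):
    """Absolute difference between the precursor and the sum of its splice products."""
    return abs(precursor - (excised + ligated))


gap = splicing_mass_gap(PRECURSOR_KDA, ENDONUCLEASE_KDA, LIGATION_PRODUCT_KDA)
# Tolerance of 0.1 kDa is an assumption that allows for rounding in the quoted sizes.
print("products account for the precursor:", gap < 0.1)
```

A gap close to zero is what excision plus ligation predicts, since the two products together contain every residue of the precursor.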

The amber suppressor tRNA (lacking the 3' terminal CA residues)
was synthesized on milligram scale by in vitro runoff transcription of
FokI-linearized pYPhe2 plasmid template with T7 RNA polymerase as
described (Ellman, et al., supra; Noren, et al., Nucleic Acids Res. 18:83
(1990)). Serine derivatives protected at the α amine with functionalities
like BPOC, CBZ, or BOC are available from commercial sources
(Bachem, Sigma, Aldrich). N-blocked serine can be converted to
N-blocked O-(o-nitrobenzyl) serine by a standard alkyl halide substitution
reaction with a reagent such as o-nitrobenzylbromide. The fully blocked
serine was then coupled to 5'-phosphodeoxyribocytidylyl-(3'-5')-
riboadenosine (pdCpA) as described (Ellman, et al., supra). The
aminoacylated dimer was then ligated to the truncated suppressor tRNA
with T4 RNA ligase (New England Biolabs, Inc.) to yield full-length
aminoacylated suppressor tRNA.

In vitro transcription/translation of the "wild-type" construct was
carried out by combining on ice: 3 µg cesium chloride-purified plasmid
DNA, 3 µl 100 mM magnesium acetate, 1 µl 100 mM calcium acetate, 7.5
µl low molecular weight mix (Ellman, et al., supra) (no calcium or
methionine), 1 µl [35S]-methionine (10 µCi/µl, 1000 Ci/mmol), 1 µl 3
mg/ml rifampicin, and water to 30 µl. The reactions were incubated for 3
minutes at 37°C while an aliquot of S-30 extract prepared from E. coli
D10 (Ellman, et al., supra) was thawed. 8.5 µl of S-30 extract was added,
followed by 1.5 µl of T7 RNA polymerase (300 U/µl, New England
Biolabs, Inc.), and the reactions were incubated 60 min. at 37°C.
Samples were electrophoresed on a 10-20% tricine SDS-PAGE gel
(NOVEX) and autoradiographed to visualize the proteins (Figure 9).

In vitro transcription/translation of the "amber mutant" was carried
out as described for the "wild-type" except that the reactions were
supplemented with 3.5 µl of chemically aminoacylated
o-nitrobenzylserine-tRNA(amber) at a concentration of approx. 3 µg/µl.
The suppressor tRNA was added to the reaction immediately before
addition of the S-30 extract.

Figure 9 shows a 10-20% tricine SDS-PAGE gel of in vitro
transcription/translation reactions primed with either the "wild-type"
(pANY5) or "amber mutant" (pAOD1) constructs. Lane 1 shows the 55.8
kDa precursor and excised 45.3 kDa I-TliI endonuclease expressed in
vitro from the "wild-type" construct. Lane 2 shows the "wild-type"
reaction supplemented with 13.5 µg of full length uncharged amber
suppressor tRNA to demonstrate there is no inhibition of translation due
to added tRNA. Lanes 3 and 4 show the result of in vitro expression of
the "amber mutant" without and with full length unacylated suppressor
tRNA (10.5 µg) added. Neither of these reactions produces the full length
precursor molecule, nor any splice products, as expected. This indicates
that the suppressor tRNA is not aminoacylated by any of the
endogenous aminoacyl-tRNA synthetases in the cell extract. The band
of approximate molecular weight 52 kDa is apparently caused by a
secondary translational initiation site just downstream from the amber
mutation. Lane 5 shows the result of supplementing the "amber mutant"
with the chemically aminoacylated o-nitrobenzylserine-tRNA(amber).
Precursor protein is produced in vitro, but no splice products (i.e., I-TliI)
are visible.

Controlled splicing was achieved by photochemically removing the
o-nitrobenzyl group from the serine which had been incorporated
site-specifically at position 1082 of the precursor protein. A 6 µl aliquot of an
in vitro reaction was treated with 0.5 µl of RNase A (10 µg/µl) to arrest
translation, irradiated with intense (275 W) visible light from a GE model
#RSK6B tanning lamp at 10 cm for 10 minutes, diluted with 4 µl of water,
and then incubated at 37°C for 60 minutes to allow splicing to occur.
The resulting splice products were visualized by electrophoresis on a
10-20% tricine SDS-PAGE gel followed by autoradiography (Figure 10).

Figure 10 illustrates the results of exposing the chemically blocked
precursor (Lane 5, Figure 9) to 350 nm light. Lanes 1 through 4 are
controls in which the "wild-type" reaction (Lane 1, Figure 9) was treated
as follows: Lane 1, incubated 60 min. at 37°C; Lane 2, added 0.5 µl
RNase (10 µg/µl) and incubated 60 min. at 37°C; Lane 3, irradiated 10
minutes with 350 nm light and incubated 60 min. at 37°C; Lane 4,
treated with RNase as above, irradiated 10 min. with 350 nm light and
incubated 60 min. at 37°C. Lanes 5-8 show the result of treating the
"blocked" precursor (Lane 5, Figure 9) in the same way as for Lanes 1-4,
respectively. Irradiation of the "blocked" precursor results in the excision
of the I-TliI (45.3 kDa) endonuclease encoded by IVPS2 (cf. Lanes 7-8
with Lanes 5-6).
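
The lane layout of Figure 10 is a two-by-two treatment design (RNase, irradiation) applied to each of the two samples. The sketch below simply enumerates that design in lane order; the dictionary keys and the `excision_expected` rule are illustrative assumptions summarizing the text, not data taken from the figure.

```python
from itertools import product


def figure10_lanes():
    """Enumerate the eight lanes: sample x irradiation x RNase, in lane order."""
    lanes = []
    for sample, irradiated, rnase in product(
        ("wild-type", "blocked"), (False, True), (False, True)
    ):
        lanes.append(
            {
                "lane": len(lanes) + 1,
                "sample": sample,
                "rnase": rnase,
                "irradiated": irradiated,
                # Assumption drawn from the text: the unblocked precursor can
                # splice in every lane, the blocked one only after irradiation.
                "excision_expected": sample == "wild-type" or irradiated,
            }
        )
    return lanes


for lane in figure10_lanes():
    print(lane)
```

Under this reading, the only lanes in which no excised endonuclease is expected are Lanes 5 and 6, which matches the comparison the text draws between Lanes 7-8 and Lanes 5-6.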
EXAMPLE 8

IN-FRAME INSERTION OF MODIFIED IVPS INTO A TARGET
GENE AND THERMAL CONTROL OF PEPTIDE BOND
CLEAVAGE

In this example, we describe how an IVPS (CIVPS) cassette can 
be modified and inserted into a target gene. As an example, we 
describe modification of Pyrococcus sp. (or Deep Vent) IVPS1 (CIVPS3) 
by substitution or deletion of the first native downstream residue (serine), 
and in-frame insertion of the modified cassettes into the EcoRV site of 
the E. coli lacZ gene. 

MODIFICATION OF IVPS CASSETTES

In general, an IVPS cassette can be modified by substitution and 
deletion of residue(s) or addition of residue(s) to one or both ends of 
IVPS. The modified or fusion proteins using such modified IVPS 
cassettes may exhibit different catalytic activities, such as splicing 
(peptide ligation) or cleavage at a specific peptide bond. 

As previously discussed, the first downstream residues at the
carboxyl splice junction are serine for Deep Vent IVPS1 (CIVPS3) and
Vent IVPS1, or threonine for Vent IVPS2. The first IVPS residue at the
amino splice junction of CIVPS1, CIVPS2 and CIVPS3 is serine.
Cysteine residues have been found at the splice junctions of the yeast
TFP1 and M. tuberculosis RecA (see Hirata, et al., supra; Kane, et al.,
supra; Davis, et al., supra). It is believed that serine, threonine or
cysteine residues at splice junctions are essential for protein splicing
and cleavage. The previous examples have shown that an IVPS together
with the first downstream residue contains sufficient information for
protein splicing. However, these residues may function differently in
various IVPS contexts. Substitutions of the native residue, for example,
a serine by threonine or cysteine in the Vent IVPS2 (CIVPS2), resulted in
reduced splicing and altered cleavage activity (see Hodges, et al.,
supra).
PREPARATION OF IVPS CASSETTES FOR INSERTION INTO A TARGET
GENE CODON

IVPS cassettes for in-frame insertion into the lacZ coding region or
any other target gene can be prepared by polymerase chain reaction
(PCR). The following protocol describes the production of four Deep
Vent IVPS1 cassettes without or with an additional carboxyl terminal
codon, serine, threonine or cysteine, referred to as CIVPS3, CIVPS3/Ser,
CIVPS3/Thr and CIVPS3/Cys, respectively.

Primer 5'-AGCATTTTACCGGAAGAATGGGTT-3' (SEQ ID NO:5)
(DV IVPS1 forward, 1839-1862) and one of the four reverse primers
described below were used to synthesize the cassettes from pNEB#720
(ATCC No. 68723). pNEB#720, used as template, has a 4.8 kb BamHI
fragment containing the Deep Vent DNA polymerase gene inserted into the
BamHI site of pUC19. Reverse primers 5'-GCAATTATGTGCATAGAGGAATCCA-3'
(SEQ ID NO:40) and 5'-GGTATTATGTGCATAGAGGAATCCA-3'
(SEQ ID NO:41) (3428-3452) were used to generate CIVPS3/Thr
and CIVPS3/Cys fragments (1614 bp), respectively. The PCR
mixture contained Vent DNA polymerase buffer (NEB), supplemented with
2 mM magnesium sulfate, 400 µM of each dNTP, 100 µg/ml BSA, 0.9 µM
of each primer, 40 ng plasmid DNA and 2 units of Vent DNA
polymerase in 100 µl. Amplification was carried out using a
Perkin-Elmer/Cetus thermal cycler at 94°C for 30 sec, 48°C for 30 sec and
72°C for 2 min for 20 cycles. Primer 5'-ATTATGTGCATAGAGGAATCCAAAG-3'
(SEQ ID NO:42) (3425-3449) was used to synthesize the CIVPS3
fragment (1611 bp) by PCR as described above except that the amplification
was carried out for [illegible] cycles. Primer 5'-GCTATTATGTGCATAGAGGAATCCA-3'
(SEQ ID NO:6) (3428-3452) was used to synthesize the IVPS1/Ser
fragment (1614 bp) as previously described in Example 1.
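
The fragment lengths and the extra carboxyl terminal codon follow directly from the primer coordinates and sequences given above. The sketch below is an illustrative check only: it assumes the coordinates are inclusive positions on the sense strand and that the first three bases of each reverse primer are the complement of the last codon of the cassette. The helper names and the small codon table are not from the original text.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Only the codons needed for this example (standard genetic code).
CODON_TO_RESIDUE = {"AGC": "Ser", "ACC": "Thr", "TGC": "Cys", "AAT": "Asn"}

FORWARD_START = 1839  # first base of the DV IVPS1 forward primer (SEQ ID NO:5)

# name: (primer sequence 5'->3', coordinate of the last cassette base)
REVERSE_PRIMERS = {
    "SEQ ID NO:40": ("GCAATTATGTGCATAGAGGAATCCA", 3452),
    "SEQ ID NO:41": ("GGTATTATGTGCATAGAGGAATCCA", 3452),
    "SEQ ID NO:42": ("ATTATGTGCATAGAGGAATCCAAAG", 3449),
    "SEQ ID NO:6": ("GCTATTATGTGCATAGAGGAATCCA", 3452),
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(forward_start, reverse_end):
    """Length of a PCR product spanning inclusive coordinates."""
    return reverse_end - forward_start + 1


def last_codon(reverse_primer):
    """Sense-strand codon encoded by the 5' end of a reverse primer."""
    return reverse_complement(reverse_primer[:3])


for name, (seq, end) in REVERSE_PRIMERS.items():
    codon = last_codon(seq)
    print(name, amplicon_length(FORWARD_START, end), "bp", codon, CODON_TO_RESIDUE[codon])
```

Read this way, the 1614 bp and 1611 bp lengths are reproduced. The primer listed as SEQ ID NO:40 ends in the complement of a cysteine codon (TGC) and SEQ ID NO:41 in that of a threonine codon (ACC), so the pairing of primer to cassette name is best confirmed against the sequence listing.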

The PCR samples were extracted with phenol and chloroform,
precipitated in 0.3 µM NaAc and 50% isopropanol at -20°C for 6 hours,
recovered by spinning at 10 Krpm for 10 min. in a microfuge, dried,
each resuspended in 20 µl of distilled water, and loaded on a 1% low melting
agarose gel for electrophoresis at 80 volts for 6 hours. DNA fragments
were recovered from the low melting agarose gel by incubation in 0.4 ml
of TE buffer (10 mM Tris-HCl/0.1 mM EDTA, pH 8.0) at 65°C for 30 min.,
extractions with phenol and chloroform, and precipitation in 0.3 µM NaAc
(pH 5.2) and 50% isopropanol at -20°C overnight. DNA was spun
down, washed with 70% ethanol, dried and resuspended in 10 µl distilled
water.

Phosphorylation of the IVPS1 DNA fragments was performed at
37°C for 60 min. with 4 µl of 10 x polynucleotide kinase buffer (NEB), 31
µl of purified DNA, 4 µl 10 mM ATP, and 10 units of T4 polynucleotide
kinase (NEB) in 40 µl. The samples were heated in a 65°C water bath
for 10 min. After addition of 80 µl of TE buffer (10 mM Tris-HCl/0.1 mM
EDTA, pH 8.0), the samples were sequentially extracted with phenol and
chloroform. DNA was precipitated in 2.4 µM NH4Ac and 70% ethanol at
-70°C overnight, pelleted by spinning at 10 Krpm for 10 min. in a
microfuge, washed with cold 70% ethanol, dried and resuspended in 20
µl distilled water. Phosphorylation of the CIVPS3/Ser fragment was as
described above.

IN-FRAME INSERTION OF CIVPS3 CASSETTES INTO THE
EcoRV SITE OF THE E. COLI lacZ GENE IN VECTOR
pAH05

PCR-synthesized CIVPS cassettes can be inserted into a target
coding region by ligation with linearized vector bearing the target gene.
Linear plasmid vector can be prepared by restriction enzyme digestion or PCR
synthesis as previously described. pAH05 carries the entire lacZ gene
sequence on a 3.1 kb BamHI-DraI fragment from pRS415 (Simons, et
al., Gene, 53:85-96 (1987)) inserted between the BamHI and SmaI sites in
the polylinker of pAGR3 (NEB) downstream of a tac promoter. The tac
promoter is a transcription control element which can be repressed by
the product of the lacIq gene and induced by isopropyl β-D-
thiogalactoside (IPTG). pAH05 contains two EcoRV recognition
sequences. EcoRV leaves blunt ends at its cleavage site. One of the
EcoRV cleavage sites cuts within the lacZ coding region between the 375th
codon (aspartic acid) and the 376th codon (isoleucine).
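
Because EcoRV cuts bluntly between two codons, a cassette inserted there stays in frame only if its length is a multiple of three. The sketch below illustrates that check for the cassette lengths stated earlier; it assumes codon numbering starts at the first lacZ codon, and the function names are hypothetical.

```python
def insertion_offset(codon_index):
    """Number of coding bases upstream of a blunt cut made after the given 1-based codon."""
    return 3 * codon_index


def keeps_reading_frame(cassette_bp):
    """A cassette inserted between two codons preserves the frame if its length is a multiple of 3."""
    return cassette_bp % 3 == 0


# Cut between codon 375 (Asp) and codon 376 (Ile) of lacZ.
print("bases upstream of the cut:", insertion_offset(375))

# Cassette lengths stated in the text: CIVPS3 (1611 bp) and the
# Ser/Thr/Cys variants (1614 bp).
for length in (1611, 1614):
    print(length, "bp cassette in frame:", keeps_reading_frame(length))
```

Frame alone does not guarantee a functional fusion: a blunt-ended cassette can ligate in either orientation, which is why the clones are subsequently screened.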

DNA was partially digested by incubation of 15 µg of pAH05 DNA
with 40 units of EcoRV (NEB) in 100 µl of 1 x NEB buffer 2 at 37°C for 60
min. 20 µl agarose gel loading dye was added to the sample after the
sample was heated to 65°C for 10 min. to inactivate EcoRV. DNA
fragments were separated by electrophoresis on a 1% low melting
agarose gel. Linearized pAH05 plasmid DNA was recovered from the
low melting agarose gel as described in Example 8 and resuspended in
distilled water.

CONSTRUCTION OF CIVPS-lacZ FUSION GENES

Construction of the CIVPS3/Ser-lacZ fusion was described in Example
2. The CIVPS3-lacZ fusion was made by ligation of dephosphorylated
pAH05 DNA to the phosphorylated IVPS1 fragment. The reaction was
carried out at 16°C for 5 hours in a 20 µl volume with 1X T4 DNA ligase buffer
(NEB), 0.1 µg pAH05 DNA, 0.5 µg IVPS1 DNA and 160 units of T4 DNA
ligase (NEB). E. coli strain RR1 was transformed by mixing 100 µl of
competent RR1 cells with 10 µl of ligation sample on ice for 30 min.,
heating at 42°C for 2 min., chilling on ice for 5 min., adding 0.8 ml LB
media (10 grams/liter tryptone, 5 grams/liter yeast extract, 10 grams/liter
NaCl, 1 gram/liter dextrose, 1 gram/liter MgCl2·6H2O, pH 7.2 at 25°C)
and incubating at 30°C for 45 min. The samples were plated onto LB
plates supplemented with 100 µg/ml ampicillin. After incubation
overnight at 30°C, about 150-300 colonies per plate were observed.

CIVPS3/Thr-lacZ and CIVPS3/Cys-lacZ fusions were made by
ligation of 0.1 µg EcoRV-linearized pAH05 DNA with 0.7 µg of
CIVPS3/Thr or CIVPS3/Cys fragment. Transformation of E. coli strain
ER2252 (NEB) was carried out by the same protocol as described
above.

Colony hybridization was utilized to screen for clones that carry
recombinant plasmids. The Deep Vent CIVPS3 forward primer,
described above, was radio-labeled with T4 polynucleotide kinase and
used as a hybridization probe. Colonies were lifted onto nitrocellulose
and treated for 5 min. in each of the following solutions: 10% SDS, 0.5 M
NaOH/1.5 M NaCl, 0.5 M Tris-HCl (pH 7.4)/0.5 M NaCl (twice) and
2 x SSC (twice). The nitrocellulose filters were dried at room temperature
for 1 hour, baked in vacuum at 80°C for 2 hours, soaked in 6 x SSC for 5
min. and washed in a solution of 50 mM Tris-HCl (pH 8.0), 1 M NaCl, 1
mM EDTA and 0.1% SDS at 42°C for 2 hours. After treatment at 42°C
for 4 hours in 6 x NET, 5 x Denhardt's, and 0.5% SDS, the filters were
incubated with the radiolabeled oligomer probe under the same
conditions overnight and then washed in 2 x SSC four times at 42°C
for 15 min. and twice at 50°C for two min., followed by autoradiography.

The positive clones that hybridized to the oligomer probe were
further examined for their ability to express fusion proteins upon induction
with IPTG. The clones were cultured in LB medium supplemented with 100
µg/ml ampicillin at 30°C until OD600nm reached 0.5. After addition of
IPTG to a final concentration of 0.3 mM, the cultures were grown at 30°C
for 4 additional hours. Crude lysates were prepared by boiling 0.1 ml of
cells with 0.1 ml of the urea lysis buffer for 10 min. The identity of the
fusion proteins from the positive clones described above was analyzed
by Western blots using antibody raised against β-galactosidase
(Promega) or I-PspI (the protein product of Deep Vent CIVPS3, NEB).
Samples were electrophoresed on 4-20% SDS gels (ISS, Daichi,
Tokyo, Japan) with prestained markers (BRL), transferred to
nitrocellulose, probed with antisera (from mouse), and detected using
alkaline phosphatase-linked anti-mouse secondary antibody as described
by the manufacturer (Promega). Deep Vent CIVPS3-lacZ fusion clones
expressed a product, reacting with both antibodies, of 173-178 kDa, the
expected size for the CIVPS3-β-galactosidase fusion proteins (Figure
11). Clones pDV7 and pDV15 contain the CIVPS3 insert. pDVC302, 306
and 307 carry the CIVPS3/Cys cassette while pDVT319, 321, 322 and
323 contain the CIVPS3/Thr cassette. pDVS712 and 742 containing the
CIVPS3/Ser insert were previously described in Example 2.

THERMAL CONTROL OF SPECIFIC PEPTIDE BOND
CLEAVAGE IN CIVPS3-β-GALACTOSIDASE FUSIONS USING
MODIFIED CIVPS3 CASSETTES

The DV IVPS1 (CIVPS3)-β-galactosidase fusions containing
cassettes in which threonine or cysteine replaces the serine at the
carboxyl terminus exhibit thermally controllable cleavage at a specific
peptide bond in the fusion proteins. The constructs described above
(CIVPS3 cassettes inserted into the lacZ EcoRV site) yield fusion
proteins after induction by IPTG. Cell extracts prepared from cells grown
at 25°C were treated at elevated temperatures (42°C or 65°C) and
analyzed by Western blots using antibody against β-galactosidase
(Promega) or I-PspI (the product of Deep Vent CIVPS3) (Figures 11 and
12). The IVPS1/Ser fusion protein can undergo protein splicing to
generate a ligated protein and free IVPS endonuclease by incubation at
elevated temperatures. While no ligation activity was observed, the
fusion proteins with the CIVPS3/Thr or CIVPS3/Cys cassette cleave
predominantly at the amino splice junction at 42°C, and both fusion proteins
exhibit increased cleavage activity at the carboxyl splice junction at 65°C.

Preparation of cell extracts from the CIVPS3-lacZ fusion clones
was performed as follows. All the fusion constructs originally
constructed in different E. coli hosts were introduced into a lacZ-deletion
E. coli strain ER1991 (New England Biolabs, Inc.), which does not
synthesize β-galactosidase, by the standard transformation procedure
as described in Example 8. A single colony from the pDV7, pDVC302,
pDVT332 or pDVS712 clone was inoculated in 1.5 ml LB medium
supplemented with 100 µg/ml ampicillin, incubated at 30°C until
OD600nm reached about 0.5 and induced with 0.3 mM IPTG by adding
1.5 ml of 0.6 mM IPTG, 100 µg ampicillin/ml LB at 25°C for 5 hours. 3 ml
of cells were spun down and resuspended in 0.5 ml of LB, sonicated for
1 min. at 4°C and spun at 6,000 rpm for 5 min. at 4°C. The supernatants
were recovered and stored at -20°C.

The cell extracts were heat-treated at 42°C or 65°C after being
quickly thawed at room temperature. The untreated control sample was
prepared by mixing 48 µl of extract with 12 µl of 5 x sample buffer (0.31 M
Tris-HCl, pH 6.8/10% SDS/25% 2-mercaptoethanol/50%
glycerol/0.005% Bromophenol blue), followed by boiling for 10 min.
Aliquots of 48 µl were transferred into 1.5 ml microfuge tubes and
incubated for 30, 60, 120, or 240 min. in a 42°C water bath, or 15, 30, 60
or 120 min. in a 65°C water bath. Each was mixed with 12 µl of 5 x
sample buffer and boiled for 10 min.
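
The two dilutions described above (IPTG added as an equal volume of 0.6 mM stock, and 5 x sample buffer added to extract) can be verified with a simple mixing calculation. This is a minimal sketch; the function names are illustrative, and volumes are assumed to be additive.

```python
def final_concentration(stock_conc, stock_volume, sample_volume):
    """Concentration of a stock component after mixing with a sample (additive volumes)."""
    return stock_conc * stock_volume / (stock_volume + sample_volume)


def dilution_fold(stock_volume, sample_volume):
    """How many fold the stock is diluted in the final mixture."""
    return (stock_volume + sample_volume) / stock_volume


# Induction: 1.5 ml of 0.6 mM IPTG added to 1.5 ml of culture.
iptg_mM = final_concentration(0.6, 1.5, 1.5)
print("final IPTG (mM):", round(iptg_mM, 3))

# Gel samples: 12 ul of 5 x sample buffer added to 48 ul of extract.
fold = dilution_fold(12, 48)
print("sample buffer diluted", fold, "fold")
```

Both results agree with the text: the culture ends up at 0.3 mM IPTG, and the 5 x buffer is brought to 1 x in the 60 µl sample.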

The treated samples were analyzed by Western blots using
antibodies raised against I-PspI (NEB) (Figures 11 and 12) or
β-galactosidase (Promega) (Figure 12). 5 µl of each sample was loaded
on 4-20% SDS polyacrylamide gels (ISS, Daichi, Tokyo, Japan) and
electrophoresed at 100 volts for 4 hours. Western blots were carried out
according to the procedure of Promega.

The results show that fusion protein precursors were the dominant
species and barely trace amounts of I-PspI endonuclease were present
in cells after IPTG induction at 25°C from all four fusion constructs,
indicating inefficient splicing and excision activity at low temperature.
However, after shifting the pDVS712 (CIVPS3/Ser-β-galactosidase
fusion) extract to higher temperatures, 42°C or 65°C, abundant CIVPS3
product, I-PspI (of about 60 kDa), accumulated (Figures 11 and 12).
Excision of the IVPS domains was coupled with ligation of the N-domain
and the C-domain of the interrupted β-galactosidase, producing a
product of 116 kDa, identical in size to full-length β-galactosidase
(Figure 12). Another major product (IVPS1-C-EPS) of about 130 kDa
(corresponding to cleavage at the amino splice junction) was observed.
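
The product sizes reported above can be related to one another with rough arithmetic. The sketch below is an approximation under stated assumptions: β-galactosidase is taken as about 1023 residues, mass is assumed to scale with residue count, and the insertion point is after codon 375. None of these estimates come from the original text, which reports only the observed sizes.

```python
BETA_GAL_KDA = 116.0      # ligated product size stated in the text
I_PSPI_KDA = 60.0         # excised CIVPS3 product size stated in the text
BETA_GAL_RESIDUES = 1023  # assumption: approximate length of beta-galactosidase
INSERT_AFTER_CODON = 375  # EcoRV insertion point in lacZ


def estimate_species():
    """Rough sizes (kDa) of the precursor and single-junction cleavage products."""
    n_eps = BETA_GAL_KDA * INSERT_AFTER_CODON / BETA_GAL_RESIDUES
    c_eps = BETA_GAL_KDA - n_eps
    return {
        "precursor": BETA_GAL_KDA + I_PSPI_KDA,
        "N-EPS": n_eps,
        "C-EPS": c_eps,
        "IVPS-C-EPS": I_PSPI_KDA + c_eps,   # cleavage at the amino splice junction
        "N-EPS-IVPS": n_eps + I_PSPI_KDA,   # cleavage at the carboxyl splice junction
    }


for name, kda in estimate_species().items():
    print(f"{name}: {kda:.1f} kDa")
```

The estimated precursor falls within the 173-178 kDa range reported for the fusion clones, and the estimate for the species cleaved at the amino splice junction is close to the roughly 130 kDa band; gel mobilities only approximate true masses, so closer agreement should not be expected.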

The fusion proteins of the other three variants (with CIVPS3,
CIVPS3/Cys and CIVPS3/Thr cassettes) were more stable at low
temperature. Very little I-PspI or other products corresponding to
cleavage at splice junctions were detected in the untreated extracts
(Figure 11). In contrast to the CIVPS3/Ser fusion, no ligated proteins
were observed in the heat-treated samples of these three fusion
constructs (Figure 12). The pDV7 (CIVPS3-β-galactosidase fusion)
sample produced only trace amounts of I-PspI and products
corresponding to cleavage at single splice junctions at 65°C, indicating
poor excision at either splice junction (Figure 11, lanes 1-3). pDVC302,
containing the CIVPS3/Cys cassette, showed accumulation of moderate
amounts of I-PspI and CIVPS3-C-EPS species at 42°C (Figure 11, lane
5). The yield of I-PspI, C-EPS and a product (N-EPS-CIVPS3) of about
110 kDa, corresponding to cleavage at the carboxyl splice junction, was
increased at 65°C while the CIVPS3-C-EPS species was reduced (Figure 11,
lanes 4-6; Figure 12). The results indicate that peptide bond
cleavage at the carboxyl splice junction of the fusion protein and/or
the CIVPS3-C-EPS product was enhanced. pDVT321 (with the CIVPS3/Thr
cassette), when treated at 42°C, showed very little I-PspI or C-EPS but a
dominant product, CIVPS3-C-EPS (Figure 11, lane 8; Figure 12). The
data indicate efficient cleavage of the peptide bond at the amino splice
junction but not at the carboxyl splice junction at 42°C. The
accumulation of a small amount of I-PspI at 65°C indicated that cleavage
at the carboxyl splice junction is enhanced (Figure 11, lane 9).

In summary, the data demonstrate that by substitution of a
single native residue, serine, at the carboxyl splice junction of the Deep
Vent IVPS1 (CIVPS3), processing of the fusion proteins is altered and
can be better controlled by temperature. The CIVPS3/Thr-β-
galactosidase fusion protein (and the CIVPS3/Cys fusion protein to a lesser
extent) efficiently cleaved the specific peptide bond at the amino splice
junction only at elevated temperatures.

EXAMPLE 9

CONSTRUCTION AND PURIFICATION OF MIP

PURIFICATION OF CIVPS FUSIONS BY AFFINITY
CHROMATOGRAPHY CLONING OF THE DEEP VENT IVPS1
INTO AN MBP FUSION PROTEIN

In one embodiment of the present invention a three-part fusion can 
be generated comprising a CIVPS; a segment which can be easily 
purified, e.g., a binding protein; and a protein or peptide of interest, i.e., a
target protein. The order of these parts can be varied. The advantage of 
such a fusion is that it can be easily purified. Once the precursor protein 
is purified, the peptide of interest can be separated from the fusion by 
unidirectional protein cleavage induced by a modified CIVPS. In 
previous Examples, we have shown that if one of the CIVPS junctions is 
modified to reduce or prevent splicing or cleavage at that junction, then 
cleavage at the other junction will be favored over splicing (see, pages 
13 and 14 and Example 8). This allows for separation of the peptide of 
interest away from the fusion. 

This Example demonstrates that such a 3-part fusion composed of 
a binding protein, maltose binding protein (MBP), CIVPS3 and a 
paramyosin peptide can be easily purified on an amylose resin as an 
unspliced precursor. The precursor can then be induced to splice, in this 
case by thermal activation. In this Example, no attempt has been made 
to limit cleavage to one side of the CIVPS so as to interfere with splicing 
to generate only cleavage products without ligation. 

SYNTHESIS OF DEEP VENT IVPS1 INSERT (CIVPS3) 

A CIVPS3 cassette was synthesized by PCR as described in
previous Examples, with the following modifications. The PCR mixture
contained Vent DNA polymerase buffer (NEB), 200 µM of each dNTP,
10 pmoles of each primer, 40 ng of plasmid DNA and 2 units of Vent DNA
polymerase in 100 µl. Amplification was carried out using a
Perkin-Elmer thermal cycler at 94°C for 30 sec, 50°C for 30 sec and 72°C for 2
min for 20 cycles. Deep Vent IVPS1 was synthesized from pNEB #720.

The forward primer was Primer 96-6,
5'-GGTACCCGTCGTGCTAGCATTTTACCGGAAGAATGGGTACCA-3' (SEQ ID NO:43),
in which 26/27 bases at the 3' end are identical to the 5' end of DV IVPS1,
and which includes 2 flanking KpnI sites. The 3' KpnI site includes a silent
substitution which creates the restriction site without changing the amino
acid residue. The Deep Vent IVPS1 reverse primer, Primer 96-7,
5'-CCCGCTATTATGTGCATAGAGGGATCC-3' (SEQ ID NO:44), has a BamHI
site at the 3' end. 23/24 bases at the 3' end are homologous to the 3'
end of DV IVPS1, with a single base substitution to create the BamHI
site. Primers 96-6 and 96-7 were used to synthesize the Deep Vent
IVPS1 cassette (1.6 kb).
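
The features claimed for the two primers (two KpnI sites, one BamHI site, and a single substitution relative to the native ends) can be checked directly on the sequences. The sketch below is illustrative: the native ends are taken from the primers quoted in Example 8 (SEQ ID NO:5 and SEQ ID NO:6), so only the 24 bases available from those primers are compared, and the helper names are hypothetical.

```python
PRIMER_96_6 = "GGTACCCGTCGTGCTAGCATTTTACCGGAAGAATGGGTACCA"  # SEQ ID NO:43
PRIMER_96_7 = "CCCGCTATTATGTGCATAGAGGGATCC"                 # SEQ ID NO:44

# Native IVPS1 ends as given by the Example 8 primers.
NATIVE_FORWARD = "AGCATTTTACCGGAAGAATGGGTT"   # SEQ ID NO:5
NATIVE_REVERSE = "GCTATTATGTGCATAGAGGAATCCA"  # SEQ ID NO:6

KPNI = "GGTACC"
BAMHI = "GGATCC"


def find_sites(seq, site):
    """Return all 0-based start positions of a recognition site in seq."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def mismatches(a, b):
    """Count differing positions over the shared length of two sequences."""
    return sum(x != y for x, y in zip(a, b))


print("KpnI sites in 96-6:", find_sites(PRIMER_96_6, KPNI))
print("BamHI sites in 96-7:", find_sites(PRIMER_96_7, BAMHI))
print("96-6 vs native 5' end:", mismatches(PRIMER_96_6[-27:], NATIVE_FORWARD))
print("96-7 vs native 3' end:", mismatches(PRIMER_96_7[-24:], NATIVE_REVERSE))
```

The single mismatch in Primer 96-6 changes GTT to GTA, both valine codons, which is consistent with the silent substitution described for the 3' KpnI site.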



The PCR sample was mixed 1:1 with chloroform and the top
aqueous layer was loaded on a 1% low melt agarose gel for
electrophoresis. The 1.6 kb band was excised from the gel and
incubated at 65°C. After the gel melted, 0.25 ml TE buffer (10 mM
Tris-HCl/0.1 mM EDTA, pH 7.5) at 65°C was added and the sample was
phenol-chloroform extracted (1:1 mixture). The DNA was precipitated in
0.5 M NaCl and 2 volumes isopropanol at -20°C for 30 min. The DNA
was spun down, dried and resuspended in 60 µl TE buffer.

PREPARATION OF pPR1002, A pMAL-c2-PARAMYOSIN ΔSal

pPR1002, a pMAL-c2-paramyosin ΔSal fusion plasmid, is a 7.2 kb
vector that contains a tac promoter driven malE gene linked to an EcoRI-
SalI fragment of the D. immitis paramyosin gene, referred to as the
paramyosin ΔSal deletion (Steel, et al., J. Immunology, 145:3917-3923
(1990)). Two samples of 4 µg each of pPR1002 were linearized with 6
units of XmnI (NEB) in 20 µl of 1X NEB buffer #2 containing 100 µg/ml
BSA at 37°C for 2 hours. The reactions were loaded onto a 1% low
melting agarose gel. The 7.2 kb band was excised and purified from the
gel as above, and resuspended in 40 µl of TE buffer.



WO 97/01642 




PCT/US96/10545 



CONSTRUCTION OF pMIP17

Ligation of pPR1002 and Deep Vent IVPS1 was carried out at
16°C for 16 hours in a 25 µl volume with addition of 14.5 µl distilled water,
2.5 µl of 10X T4 DNA ligase buffer (NEB), 1 µl of cleaved pPR1002
DNA, 5 µl of 0.2 µg/µl Deep Vent IVPS1 prepared as described above
and 800 units of T4 DNA ligase (NEB).

E. coli strain ER2252 was transformed on ice for 5 min. by mixing
100 µl of competent ER2252 cells with 5 µl of ligation sample in 100 µl of a
1:2 mix of 0.1 M CaCl2 and 1X SSC (0.15 M NaCl, 15 mM NaCitrate),
heating at 42°C for 3 min., chilling in ice for 5 min., adding 0.1 ml LB
media (10 grams/liter tryptone, 5 grams/liter yeast extract, 10 grams/liter
NaCl, 1 gram/liter dextrose, 1 gram/liter MgCl2·6H2O, pH 7.2 at 25°C)
and incubating for 30 min. at 30°C. 300 µl of transformed cells were
pelleted and resuspended in 100 µl supernatant and plated onto an LB
amp plate. After incubation overnight at 30°C, about 160 colonies were
observed.

PCR amplification was utilized to screen for colonies that carried
recombinant plasmids. Individual colonies were picked into 100 µl of
distilled water in a 96 well microtitre dish, and boiled for 5 min. to lyse the
cells. The PCR mixture contained Vent DNA polymerase buffer (NEB),
200 µM of each dNTP, 10 pmoles of each primer (same as above), 2.5 µl
of cell lysate and 2 units of Vent Exo- DNA polymerase in a 50 µl
reaction. Amplification was carried out using a Perkin-Elmer thermal
cycler at 94°C for 30 sec, 50°C for 30 sec and 72°C for 2 min for 30
cycles. 10 µl of each reaction was run on a 1% agarose gel. The
positive clones had bands corresponding to IVPS1 (1.6 kb) and one
positive plasmid was designated pMIP17.

EXPRESSION OF MIP: THE MBP-DEEP VENT IVPS1-
PARAMYOSIN ΔSal FUSION

Positive clones containing pMIP17 were cultured in LB media
supplemented with 100 µg/ml ampicillin at 30°C until OD600nm reached
0.5. To prepare a lysate from uninduced cells, 1.0 ml of culture was
pelleted and resuspended in 50 µl Protein sample buffer (125 mM Tris,
700 mM β-mercaptoethanol, 2% SDS, 15% glycerol and 1 mg/ml
Bromophenol Blue). Samples from induced cultures were prepared as
follows. After addition of IPTG to a final concentration of 1 mM, the
cultures were grown at 30°C for 20 additional hours. Cells from 0.5 ml
culture at 5 hours and 20 hours after induction were pelleted and then
resuspended in 100 µl 5X protein sample buffer. The pre-induction and
5-hour samples were frozen at -20°C for 16 hours and the 20-hour
sample was frozen at -70°C for 15 minutes. To improve precursor yield,
cultures were induced at 12°C-20°C and amounts of precursor were
determined by Coomassie Blue stained gel. All the samples were boiled
for 5 minutes and the protein products were analyzed by electrophoresis
in SDS-PAGE followed by Coomassie Blue staining or Western blots
using antibody raised against I-PspI (NEB). The samples were
electrophoresed on 4-20% SDS gels (ISS, Daichi, Tokyo, Japan) with
prestained markers (BRL), transferred to nitrocellulose, probed with
antisera (mouse anti-I-PspI), and detected using alkaline phosphatase-
linked anti-mouse secondary antibody as described by the manufacturer
(Promega). A predicted major band at about 132 kDa was observed in
both the Coomassie Blue stained gels and Western blots (data not
shown).

LARGE SCALE PURIFICATION OF THE MBP-DEEP VENT
IVPS1-PARAMYOSIN ΔSal FUSION ON AMYLOSE AND
MONOQ COLUMNS



Single colonies were used to inoculate 4 x 10 ml LB media
supplemented with 100 µg/ml ampicillin and incubated at 30°C until
OD600nm reached 0.5. These cultures were used to inoculate 4 x 1 litre
LB media supplemented with 100 µg/ml ampicillin and incubated at 30°C
until OD600nm reached 0.5. The cultures were then transferred to 12°C
and induced with 1 mM IPTG overnight. The cells were pelleted and
resuspended in column buffer (20 mM NaPO4 pH 7.4, 200 mM NaCl and
1 mM EDTA), sonicated, spun down and the cleared culture lysate
loaded over amylose resin (NEB Protein fusion and purification system).
Fusion protein was eluted with maltose (as described by the
manufacturer) and examined on an SDS-PAGE gel (Figure 13A and
13B). The amylose resin eluate was further purified by chromatography
on FPLC MonoQ anion exchange resin (Pharmacia). The column was
washed with 0.2 M NaCl, 10 mM Tris-HCl, pH 8.5 and eluted with a linear
gradient of NaCl from 0.2 to 1.0 M in 10 mM Tris-HCl, pH 8.5. Protein
eluted between 0.4 and 0.6 M NaCl.



Six protein bands were identified by Western blot with antibodies
to MBP, I-PspI and paramyosin. Two bands of apparent molecular mass
180 kDa and 132 kDa reacted with all three antibodies. The full length
precursor should be 132 kDa. The higher molecular weight band is
thought to be a splicing intermediate and similar high molecular weight
species have been seen with all CIVPS constructs. The excised I-PspI
ran at 60 kDa and was only recognized by the I-PspI antibody, and the
spliced product (MBP-Paramyosin ΔSal, 72 kDa) was only recognized by
the MBP and paramyosin antisera. A band of
approximately 103 kDa reacted with only the MBP and I-PspI antibodies
and represents the product of a single cleavage at the C terminus of the
IVPS. A band of approximately 89 kDa reacted with only the I-PspI and
paramyosin antisera and represents the product of a single cleavage at
the N terminus of the IVPS (Figure 13A and 13B).
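
The band assignments above can be cross-checked with simple bookkeeping. The following is a minimal sketch, not part of the original work: it assumes apparent masses are additive, infers the MBP and paramyosin domain masses by subtraction from the precursor and single-cleavage bands, and then tests whether the spliced product (the only independent value) comes out at the reported size. The names `OBSERVED`, `DOMAIN_MASS` and the helper functions are illustrative only.

```python
# Apparent masses (kDa) of the bands reported in the Western blot above.
OBSERVED = {
    "MIP": 132,  # full-length precursor
    "MP": 72,    # spliced product (MBP-paramyosin)
    "I": 60,     # excised I-PspI
    "MI": 103,   # single cleavage at the C terminus of the IVPS
    "IP": 89,    # single cleavage at the N terminus of the IVPS
}

# Domain masses inferred by subtraction. Assumption of this sketch:
# apparent masses on the gel behave additively.
DOMAIN_MASS = {
    "M": OBSERVED["MIP"] - OBSERVED["IP"],  # MBP
    "I": OBSERVED["I"],                     # IVPS (I-PspI)
    "P": OBSERVED["MIP"] - OBSERVED["MI"],  # paramyosin
}

ANTIBODY = {"M": "anti-MBP", "I": "anti-I-PspI", "P": "anti-paramyosin"}


def predicted_mass(species):
    """Sum the domain masses of a species written as letters, e.g. 'MI'."""
    return sum(DOMAIN_MASS[domain] for domain in species)


def expected_antibodies(species):
    """Antibodies expected to recognize a species, one per domain present."""
    return {ANTIBODY[domain] for domain in species}


for name, observed in OBSERVED.items():
    print(name, predicted_mass(name), observed, sorted(expected_antibodies(name)))
```

Under these assumptions the inferred domain masses reproduce every reported band, including the 72 kDa spliced product, and the predicted antibody reactivities match the pattern described for each species.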

EXCISION AND LIGATION OF THE MBP-DEEP VENT
IVPS1-PARAMYOSIN ΔSal FUSION

Amylose resin and MonoQ preparations containing several MIP-
related polypeptides, including precursor (132 kDa), slowly migrating
species (180 kDa apparent molecular mass), products of cleavage at a
single splice junction (103 kDa and 89 kDa), and small amounts of
spliced and excised products (72 kDa and 60 kDa) were heat-treated at
37°C for 2 hours in 20 mM sodium phosphate (pH 6.0) and 0.5 M NaCl.
The 132 kDa precursor and 180 kDa slowly migrating species decreased
with time, while both the 72 kDa spliced product and the 60 kDa excised
I-PspI increased (Figure 13A and 13B).

These results indicate that not only is it possible to purify 3-part
CIVPS fusions, but that it is also possible to obtain single cleavage
products. Further manipulation of the CIVPS junctions can favor
cleavage at either splice junction without ligation.

EXAMPLE 10
MODIFICATION OF CIVPS IN MIP FUSIONS

CONSTRUCTION OF MIP WITH REPLACEABLE SPLICE
JUNCTION CASSETTES



In this Example, an MIP fusion (see Example 9) with replaceable
cassettes at both splice junctions was constructed and the CIVPS was
modified by cassette substitution. We also show in two cases that
modified CIVPSs are capable of cleavage at predominantly a single
splice junction in a thermally inducible manner.

In Example 9, we described a three-part fusion, MIP, that can be
generated with the following properties: a CIVPS, a binding domain
which can be easily purified (MBP) and a gene of interest (Paramyosin
ΔSal). Splicing of the purified fusion protein yielded two major products,
the ligated protein domains, MBP-paramyosin ΔSal, and the excised
CIVPS (or I-PspI). We reasoned that some modifications in the CIVPS
may result in inhibition of the ligation reaction and enhancement of
cleavage at one splice junction. This would result in separating the
peptide of interest from the fusion protein by cleavage at a specific
peptide bond catalyzed by a modified CIVPS. In Example 8, we have
shown that cleavage at one splice junction can be enhanced by
modification of CIVPS3 (substitution of the C-terminal Ser by Thr or Cys)
and that these changes reduce or prevent splicing or cleavage at the
other junction. In order to screen for modifications with favorable
properties of controllable splicing or cleavage activity, it is necessary to
introduce and analyze various mutations at the splice junctions. This
could be accomplished by synthesis of the entire CIVPS cassette
carrying each modification. However, this is likely to introduce extra
mutations during PCR.

We have developed a strategy to facilitate the process by replacing
only a short stretch of DNA around the splice junctions. In this Example,
we describe how the original MIP fusion of Example 9 has been
modified to contain two unique restriction sites flanking each splice
junction. In a cassette replacement, following restriction digestion, the
short stretch of DNA between the two unique restriction sites at one of
the splice junctions can be replaced by another short DNA cassette. In
this example, we modified the pMIP17 fusion described in Example 9 to
contain two unique restriction sites at each junction: a XhoI site and a
KpnI site flanking the amino splice junction and a BamHI site and a StuI
site flanking the carboxyl splice junction (see Figure 14).

The MIP fusion with splice junction cassettes was constructed in two
steps. First, the BamHI and StuI sites were introduced as follows. 4 µg of
pMIP17 (Example 9) was digested in 1x EcoRI buffer (NEB) with 0.5
units of EcoRI (NEB) in 50 µl at 37°C for 10 min. After electrophoretic
separation in a 1% agarose gel, linearized pMIP17 plasmid DNA (8.8
kb) was purified by using a GenecleanII kit (BIO101). The purified
pMIP17 DNA was digested in 1x BamHI buffer (NEB) supplemented with
100 µg/ml BSA and 40 units of BamHI (NEB) at 37°C for 3 hours and then
extracted with phenol and chloroform. DNA was precipitated in 0.3 M
NaAcetate (pH 5.2) and 50% 2-propanol at -20°C for 2 hours. DNA was
recovered by spinning for 10 min at 10,000 rpm in a microfuge, dried
and resuspended in 20 µl sterile water.

Prior to ligation with the vector, two complementary oligomers,
MIP301F (5'-GATCCCTCTATGCACATAATTCAGGCCTC-3' (SEQ ID
NO:46)) and MIP302R (5'-AATTGAGGCCTGAATTATGTGCATAGAGG-3'
(SEQ ID NO:47)) were allowed to anneal to form a double-stranded
linker, MIP301F/MIP302R. 50 pmols of oligomers MIP301F and
MIP302R were incubated in 1x T4 DNA ligase buffer (NEB) at 68°C for
15 min and slowly cooled to 20°C-30°C. 1 µg of EcoRI-BamHI-digested
pMIP17 DNA was ligated at 16°C for 14 hours in 35 µl of 1x T4 DNA ligase
buffer (NEB) with 80 units of T4 DNA ligase (NEB) and 25 pmols of the
linker MIP301F/MIP302R.
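
The design of the annealed linker can be checked computationally. The sketch below is illustrative rather than part of the original procedure: it reads the final base of MIP302R as G (which is what full complementarity to MIP301F requires), verifies that the two oligomers pair everywhere except their first four bases, reports the resulting 5' overhangs, and confirms that the linker carries a StuI recognition sequence (AGGCCT). The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Oligomer sequences written 5' to 3'. The last base of MIP302R is read
# as G here, an assumption made so that the strands are fully complementary.
MIP301F = "GATCCCTCTATGCACATAATTCAGGCCTC"
MIP302R = "AATTGAGGCCTGAATTATGTGCATAGAGG"

STUI_SITE = "AGGCCT"


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhangs(top, bottom, overhang_len=4):
    """Return the 5' overhangs of an annealed duplex.

    Both strands are expected to pair everywhere except their first
    overhang_len bases; a ValueError is raised otherwise.
    """
    top_core = top[overhang_len:]
    bottom_core = bottom[overhang_len:]
    if top_core != reverse_complement(bottom_core):
        raise ValueError("strands are not complementary outside the overhangs")
    return top[:overhang_len], bottom[:overhang_len]


top_overhang, bottom_overhang = five_prime_overhangs(MIP301F, MIP302R)
print("overhangs:", top_overhang, bottom_overhang)
print("StuI site present:", STUI_SITE in MIP301F)
```

The two overhangs are GATC and AATT, the single-stranded ends left by BamHI and EcoRI respectively, so the linker can only insert between the BamHI and EcoRI ends of the digested vector.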

The resulting construct was termed pMIP18. The upstream XhoI
and KpnI sites were introduced into pMIP18 as follows. 2 µg of pMIP18
was digested at 37°C for 4 hours in 100 µl of 1x Buffer 2 (NEB), 100
µg/ml BSA and 20 units of KpnI (NEB). Following electrophoretic
separation, linear pMIP18 DNA was purified by using the GenecleanII kit
(BIO101). Prior to ligation with the vector, two complementary oligomers,
MIP521F (5'-GCTCGAGGCTAGCATTTTACCGGAAGAATGGGTAC-3'
(SEQ ID NO:48)) and MIP522R (5'-CCATTCTTCCGGTAAAATGCTAGCCTCGAGCGTAC-3'
(SEQ ID NO:49)) were allowed to anneal to form a
double-stranded linker, MIP521F/MIP522R. 50 pmols of oligomers
MIP521F and MIP522R were incubated in 1x T4 DNA ligase buffer
(NEB) at 75°C for 15 min and slowly cooled to 20°C-30°C. 0.2 µg of
digested pMIP18 was ligated at room temperature for 3 hours in 35 µl of
1x T4 DNA ligase buffer (NEB), 80 units of T4 DNA ligase (NEB) and 25
pmols of the linker MIP521F/MIP522R. In each case, the ligated DNA
samples were used to transform E. coli strain ER2252 (NEB). The final
construct, pMIP21, contains two unique restriction sites at each splice
junction. There is a XhoI site and a KpnI site surrounding the N-terminal
splice junction and a BamHI site and a StuI site surrounding the
C-terminal splice junction (Figure 14).

Western blot analysis was performed to examine expression of
modified MIP21 fusion protein and splicing activity. ER2252 containing
pMIP21 was cultured at 30°C in LB medium supplemented with 100 µg/ml
ampicillin until OD600nm reached 0.5. The culture was then induced with
1 mM IPTG at 30°C for 3 hours. 4.5 ml of the culture was pelleted,
resuspended in 0.5 ml LB medium and sonicated on ice. The cleared
supernatant was electrophoresed on a 4-20% polyacrylamide gel at 100
volts for 4 hours. A Western blot was probed with anti-MBP sera. The
results indicate that splicing activity from the modified MIP21 fusion was
indistinguishable from that of MIP17.

MODIFICATION OF MIP21 BY SPLICE JUNCTION CASSETTE 
REPLACEMENT 

In the modified MIP fusion construct, pMIP21, the amino splice
junction cassette includes 8 amino acid residues between the XhoI and
KpnI sites and the carboxyl splice junction cassette contains a sequence
coding for 6 amino acid residues between the BamHI and StuI sites.
Splice junctions can be changed by replacing either the N-terminal
XhoI-KpnI cassette or the C-terminal BamHI-StuI cassette. In the case of
the C-terminal cassette replacement, pMIP21 is first digested with BamHI
and StuI. Complementary primers containing desired mutations are
substituted for the original BamHI-StuI cassette. In this Example, two
different junction cassettes were substituted for the MIP21 BamHI-StuI
cassette.

In the following cassette replacement examples, we substituted
Ala535 by Lys or His536 by Leu.

Complementary oligomers MIP303F (5'-GATCCCTCTATAAGCATAATTCAGG-3'
(SEQ ID NO:50)) and MIP304R (5'-CCTGAATTATGCTTATAGAGG-3' (SEQ ID
NO:51)) were used to substitute residue Ala535 by Lys. Complementary
oligomers MIP311F (5'-GATCCCTCTATGCACTGAATTCAGG-3' (SEQ ID NO:52))
and MIP312R (5'-CCTGAATTCAGTGCATAGAGG-3' (SEQ ID NO:53)) were used
to substitute His536 by Leu.
These two pairs of complementary oligomers were treated as described
above to form double-stranded linkers. Both linkers contain compatible
termini to replace the carboxyl splice junction cassette following
BamHI-StuI cleavage of pMIP21. 2 µg of pMIP21 DNA was digested with
40 units of BamHI (NEB) in 1x BamHI buffer (NEB) supplemented with
100 µg/ml BSA at 37°C for 4 hours, extracted with chloroform and
precipitated in 0.3 M NaAcetate (pH 5.2) and 50% 2-propanol at -20°C
for 2 hours. DNA was recovered by spinning for 10 min at 10,000 rpm in
a microfuge, dried and resuspended in 88 µl sterile water. The
BamHI-digested pMIP21 DNA was then digested with 40 units of StuI
(NEB) in 100 µl of 1x Buffer 2 (NEB) at 37°C for 3 hours, extracted with
chloroform, and precipitated in 0.3 M NaAcetate (pH 5.2) and 50% 2-propanol
at -20°C overnight. pMIP21 DNA was recovered by spinning for 10
min at 10,000 rpm in a microfuge, dried and resuspended in 30 µl sterile
water. 0.1 µg BamHI-StuI digested DNA was ligated at 23°C for 6 hours
with 6 pmols of linker MIP303F/MIP304R or MIP311F/MIP312R in 10 µl
of 1x T4 DNA ligase buffer (NEB) in the presence of 40 units of T4 DNA
ligase (NEB). The ligated DNA was used to transform E. coli RR1.
pMIP23 contains the Ala535 to Lys substitution and pMIP28 contains the
His536 to Leu substitution. Expression of the modified MIP fusions,
MIP23 and MIP28, was tested by Western blot analysis with anti-MBP
antibody as described above. The results indicated that splicing activity
was blocked in both fusion constructs. However, each modification
resulted in increased cleavage activity at only one of the splice junctions.
The Ala535 to Lys substitution in MIP23 drastically enhanced cleavage
activity at the carboxyl splice junction and the His536 to Leu substitution
in MIP28 showed strong amino splice junction cleavage.
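
The amino acid change encoded by each replacement cassette can be confirmed by translating the forward oligomers. The sketch below is illustrative and rests on three stated assumptions: the wild-type cassette is taken to be the first 25 bases of MIP301F, the reading frame is taken to start at the third base of each oligomer, and numbering is set so that the fourth codon is residue 535. The frame and numbering are inferred from the substitutions named in the text, not given by it.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

FRAME_OFFSET = 2     # assumption: codons begin at the third base
FIRST_RESIDUE = 532  # assumption: makes the fourth codon residue 535

# Forward-strand cassette sequences, 5' to 3'.
WILD_TYPE = "GATCCCTCTATGCACATAATTCAGG"  # first 25 bases of MIP301F
MIP303F = "GATCCCTCTATAAGCATAATTCAGG"
MIP311F = "GATCCCTCTATGCACTGAATTCAGG"


def translate(oligo):
    """Translate complete codons of an oligomer in the assumed frame."""
    coding = oligo[FRAME_OFFSET:]
    usable = len(coding) - len(coding) % 3
    return "".join(CODON_TABLE[coding[i:i + 3]] for i in range(0, usable, 3))


def substitutions(wild_type, mutant):
    """List residue changes between two cassettes, e.g. 'A535K'."""
    changes = []
    pairs = zip(translate(wild_type), translate(mutant))
    for index, (ref, alt) in enumerate(pairs):
        if ref != alt:
            changes.append(f"{ref}{FIRST_RESIDUE + index}{alt}")
    return changes


print(translate(WILD_TYPE))
print(substitutions(WILD_TYPE, MIP303F))
print(substitutions(WILD_TYPE, MIP311F))
```

In this frame each mutant cassette differs from the wild type at exactly one codon: GCA to AAG (Ala to Lys) in MIP303F and CAT to CTG (His to Leu) in MIP311F.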

PURIFICATION OF MODIFIED MIP FUSION PROTEINS AND
THERMAL INDUCIBLE CLEAVAGE ACTIVITY

Expression of the fusion proteins was induced at low temperature
and MIP fusion proteins were purified by amylose resin columns. RR1
harboring pMIP23 or pMIP28 were cultured in 1 liter of LB medium
supplemented with 100 µg/ml ampicillin at 30°C until OD600nm reached
0.5. After the cultures were cooled on ice to about 15°C, IPTG was
added to a final concentration of 0.3 mM, and the cultures were grown at
12°C-14°C for 12 additional hours. Cells were pelleted, immediately
frozen at -70°C and stored at -20°C. The pellets were separately
sonicated in column buffer (10 mM Tris pH 8.5, 500 mM NaCl) and spun
down. The cleared lysate from each MIP fusion was loaded over
amylose resin (NEB Protein fusion and purification system), washed and
eluted with maltose (as described in Example 9).

A purified sample of MIP23 was dialyzed in 20 mM NaPO4
(pH 6.0)/500 mM NaCl at 4°C. The sample was then incubated at 4°C,
37°C, 50°C, and 65°C for one hour and then electrophoresed on a
4-20% SDS-PAGE gel followed by Coomassie Blue staining (Figure
15A). The gel shows that with an increase in temperature MIP23 does
not form the ligated product (MP) or the excised product (I), as the
original construct does, but instead accumulates the C-terminal cleavage
products (MI, 103 kD and P, 29 kD).

A purified MIP28 sample was dialyzed in 20 mM NaPO4
(pH 6.0)/500 mM NaCl at 4°C for 1.5 hours. The sample was then
incubated at 4°C, 42°C, 50°C, and 65°C for one hour and mixed with 1/5
volume of 5x Protein sample buffer (125 mM Tris, 700 mM
β-mercaptoethanol, 2% SDS, 15% glycerol and 1 mg/ml Bromophenol
Blue). The protein products were analyzed by a 4-20% SDS-PAGE
followed by Coomassie Blue staining (Figure 15A and 15B). The data
indicated that splicing activity was completely blocked under these
conditions. Cleavage activity at the amino splice junction increased
with the increase in temperature, yielding more MBP (M, 43
kD) and CIVPS3-paramyosin ΔSal (IP, 89 kD) at 65°C.

These results show that the splice junction cassette replacement
method can be utilized to modify the splice junctions in a fusion construct
and such modifications may result in drastic effects on splicing and
cleavage activity. Furthermore, these data give examples of constructs
where cleavage at only one splice junction is observed in the absence of
ligation and total excision of the CIVPS.

EXAMPLE 11

CONSTRUCTION AND PURIFICATION OF MIC
REPLACEMENT OF FOREIGN GENE IN CIVPS FUSIONS

A three-part fusion protein (MIP), composed of a binding domain
for easy purification, a splicing domain (CIVPS3), and a target protein
(paramyosin), was constructed as described in Example 9. This
construct was purified and shown to be able to splice by thermal
activation. To test the ability of this system to accept different target
proteins, paramyosin in the MIP construct was replaced by the chitin
binding domain (CBD) from the Saccharomyces cerevisiae chitinase
gene (Kuranda and Robbins, J. Biological Chem., 266(29):19758-19767
(1991)). The ability of this second protein fusion to splice and form both
ligated and excised products shows that this fusion method can be
employed with other foreign proteins. In addition, the chitin binding
domain can be used as an alternate binding protein for protein
purification.

SYNTHESIS OF THE CHITIN BINDING DOMAIN (CBD)

A chitin binding domain was synthesized by PCR as described in
the previous examples, with the following modifications. The PCR
mixture contained Vent DNA polymerase buffer (NEB), 200 µM of each
dNTP, 10 pmoles of each primer, 20 ng of plasmid DNA and 1 unit of Vent
DNA polymerase in 100 µl. Amplification was carried out using a
Perkin-Elmer thermal cycler at 95°C for 30 sec, 55°C for 30 sec, and
72°C for 30 sec for 20 cycles. The chitin binding domain was
synthesized from pCT30, a plasmid containing the Saccharomyces
cerevisiae chitinase gene (Kuranda and Robbins, J. Biological Chem.,
266(29):19758-19767 (1991)).

The forward primer, primer 99-02, 5'-GTCAGGCCTCTCAGACAGTACAGCTCGTACAT-3'
(SEQ ID NO:54) has a StuI site (AGGCCT (SEQ
ID NO:55)) at the 5' end. 22 bases at the 3' end of the primer are
identical to the 5' end of the chitin binding domain of the chitinase gene.
The reverse primer, primer 99-03, 5'-CCCCTGCAGTTAAAAGTAATTGCTTTCCAAATAAG-3'
(SEQ ID NO:56) has a PstI site (CTGCAG (SEQ ID
NO:57)) at the 5' end. 26 bases at the 3' end of the primer are identical
to the antisense strand at the 3' end of the chitin binding domain of the
chitinase gene. Primers 99-02 and 99-03 were used to synthesize the
chitin binding domain cassette (270 bp).
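
The placement of the cloning sites within the two primers can be verified with a short search. This is an illustrative sketch only; the dictionary and function names are hypothetical, and the primer sequences are those given above with line-break spaces removed.

```python
# Recognition sequences named in the text.
SITES = {"StuI": "AGGCCT", "PstI": "CTGCAG"}

# Primer sequences, written 5' to 3'.
PRIMERS = {
    "99-02": "GTCAGGCCTCTCAGACAGTACAGCTCGTACAT",
    "99-03": "CCCCTGCAGTTAAAAGTAATTGCTTTCCAAATAAG",
}


def locate_sites(primer):
    """Map each recognition site found in the primer to its 0-based start."""
    return {
        name: primer.find(site)
        for name, site in SITES.items()
        if site in primer
    }


for name, sequence in PRIMERS.items():
    print(name, len(sequence), locate_sites(sequence))
```

Each primer carries exactly one of the two sites, starting three bases in from its 5' end, which leaves a short 5' tail for the restriction enzyme to bind.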

The PCR sample was extracted with phenol-chloroform (1:1
mixture) and the DNA was precipitated in 0.5 M NaCl and 2 volumes
isopropanol at -20°C for 30 min. The DNA was spun down, dried and
resuspended in 40 µl TE buffer (10 mM Tris-HCl, 0.1 mM EDTA, pH 7.5). A
digest containing 20 µl of the resuspended DNA, 21 µl distilled water, 5 µl
10X NEB Buffer #2, 40 units PstI (NEB) and 20 units StuI (NEB) was then
carried out at 37°C for two hours in a 50 µl volume. The reaction was
loaded on a 1.8% low melt agarose gel for electrophoresis. The 0.25 kb
PstI/StuI digested product was excised from the gel and incubated at
65°C until the gel melted. 0.25 ml TE buffer at 65°C was added and the
sample was phenol-chloroform extracted (1:1 mixture). The DNA was
precipitated in 0.5 M NaCl and 2 volumes isopropanol at -20°C for 30
min, spun down, dried and resuspended in 40 µl TE buffer.

PREPARATION OF pMIP21

A PstI/StuI double digest separates the paramyosin coding region
from the remainder of pMIP21, described in Example 10. Two
samples of 5 µg each of pMIP21 were digested with 60 units PstI (NEB)
and 30 units StuI (NEB), 5 µl of NEB buffer #2, and 34 µl distilled water in
a 50 µl volume at 37°C for two hours. The reactions were loaded onto a
1% low melting agarose gel. The 8.1 kb band was excised and purified
from the gel as above, and resuspended in 40 µl TE buffer.

CONSTRUCTION OF MBP-DEEP VENT IVPS1-CBD FUSION
(MIC)

The chitin binding domain was substituted for paramyosin in
MIP21 as follows to create MBP-Deep Vent IVPS1-CBD constructs
(MIC). 1 µl of 8.1 kb pMIP21 fragment and 10 µl of chitin binding domain (both
prepared as described above) were combined with 9.5 µl distilled water,
2.5 µl of 10X T4 DNA ligase buffer (NEB), and 800 units of T4 DNA ligase
(NEB) and incubated at 16°C for 4 hours in a 25 µl volume.
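
The volumes listed for this ligation leave a small remainder for the enzyme, which can be worked out directly. The sketch below is an illustration, not part of the original protocol: it sums the stated components, attributes the remaining volume to the 800 units of ligase, and reports the enzyme stock concentration and final buffer strength that this implies. All variable names are hypothetical, and the inference assumes nothing else was added to the reaction.

```python
TOTAL_VOLUME_UL = 25.0
LIGASE_UNITS = 800
BUFFER_STOCK_X = 10  # buffer is supplied as a 10X stock

# Component volumes (µl) stated in the text.
components_ul = {
    "8.1 kb pMIP21 fragment": 1.0,
    "chitin binding domain": 10.0,
    "distilled water": 9.5,
    "10X T4 DNA ligase buffer": 2.5,
}

stated_volume = sum(components_ul.values())

# Assumption: whatever volume is left over was occupied by the ligase.
ligase_volume = TOTAL_VOLUME_UL - stated_volume
implied_ligase_units_per_ul = LIGASE_UNITS / ligase_volume

final_buffer_x = (
    components_ul["10X T4 DNA ligase buffer"] * BUFFER_STOCK_X / TOTAL_VOLUME_UL
)

print("stated components:", stated_volume, "µl")
print("volume left for ligase:", ligase_volume, "µl")
print("implied ligase stock:", implied_ligase_units_per_ul, "units/µl")
print("final buffer strength:", final_buffer_x, "X")
```

With these figures the stated components account for 23 µl, leaving 2 µl for the enzyme, and the buffer ends up at 1X in the 25 µl reaction.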

E. coli strain RR1 tonA (NEB) was transformed by (1) mixing 100 µl
of competent RR1 tonA cells with 5 µl of ligation sample and 100 µl of a
1:2 mix of 0.1 M CaCl2 and 1X SSC (0.15 M NaCl, 15 mM NaCitrate) on ice
for 5 min., (2) heating at 42°C for 3 min., (3) chilling in ice for 5 min. and
(4) plating onto an LB amp plate. After incubation overnight at 30°C,
about 200 colonies were observed.

Alkaline lysis mini-prep DNA (Sambrook, supra) was utilized to
screen for clones that carry recombinant plasmids with the chitin binding
domain. When digested with PstI and StuI, the positive clones had a
band corresponding to chitin binding domain and a band corresponding
to the vector. The restriction enzyme digests were carried out by mixing
10 µl miniprep DNA, 2.5 µl NEB buffer #2, 8.5 µl distilled water, 40 units
PstI (NEB) and 20 units StuI (NEB) in a 25 µl volume at 37°C for 2 hours.

EXPRESSION OF THE MIC FUSIONS

To verify MIC constructs, small scale protein preparations were
analyzed on Coomassie Blue stained gels and Western blots. The
positive clones were cultured in LB media supplemented with 100 µg/ml
ampicillin at 30°C until OD600 reached approximately 0.5. To prepare
lysate from uninduced cells, 1.5 ml of culture was pelleted and
resuspended in 25 µl 5X Protein sample buffer (125 mM Tris, 700 mM
β-mercaptoethanol, 2% SDS, 15% glycerol and 1 mg/ml Bromophenol
Blue). Protein samples from induced cultures were prepared as follows.
After cooling the cultures to 12°C, IPTG was added to a final
concentration of 1 mM and the cultures were grown at 12°C for 5
additional hours. After 2 hours of induction, a 1.5 ml sample was taken
and after 5 hours of induction a 3 ml sample was taken. Samples were
pelleted, resuspended in 50 µl 5X protein sample buffer, frozen at -20°C
for 16 hours, and then thawed and boiled for 5 minutes. The protein
products were analyzed by Coomassie Blue stained gels and Western
blots using anti-MBP antibody (NEB). The samples were
electrophoresed on 4-20% SDS gels (ISS, Daichi, Tokyo, Japan) with
prestained markers (BRL), transferred to nitrocellulose, probed with
anti-MBP antibody, and detected using alkaline phosphatase-linked
anti-rabbit secondary antibody as described by the manufacturer
(Promega). A predicted major band at about 110 kDa for the MIC fusion
protein was observed in both the Coomassie Blue stained gels and
Western blots.

LARGE SCALE PURIFICATION OF MIC ON AMYLOSE AND 
MONOQ COLUMNS 

Single colonies were used to inoculate 3 x 10 ml LB media
supplemented with 100 µg/ml ampicillin and incubated at 30°C
overnight. These cultures were used to inoculate 3 x 1 liter LB media
supplemented with 100 µg/ml ampicillin and incubated at 30°C until
OD600 reached 0.5. The cultures were then transferred to 12°C and
induced with 1 mM IPTG overnight. The cells were pelleted and
resuspended in column buffer (10 mM Tris-HCl pH 8.5, 500 mM NaCl),
sonicated, spun down and the cleared culture lysate loaded over
amylose resin (NEB Protein fusion and purification system). Fusion
protein was eluted with maltose (as described by the manufacturer) and
examined on an SDS-PAGE gel. The amylose resin eluate was further
purified by chromatography on FPLC MonoQ anion exchange resin
(Pharmacia). The column was washed with 0.2 M NaCl, 10 mM Tris-HCl
pH 8.5 and eluted with a linear gradient of NaCl from 0.2 to 1.0 M in
10 mM Tris-HCl, pH 8.5. Protein eluted between 0.4 and 0.6 M NaCl. The MIC
and MIP protein fusion products purified similarly on both the amylose
resin and the MonoQ resin.

EXCISION AND LIGATION OF THE MBP-DEEP VENT
IVPS1-CBD FUSION

An amylose purified sample of MIC was dialyzed to 20 mM NaPO4
pH 6.0, 500 mM NaCl. The sample was then heat treated at 4°C, 37°C,
50°C, and 65°C for one hour and then examined on an SDS-PAGE gel
(Figure 16). The gel shows an abundance of MIC precursor,
approximately 110 kDa, in the 4°C sample which decreases after thermal
induction. Along with the decrease in precursor, an accumulation of
ligated product of approximately 53 kDa in size, MBP-CBD (MC), and
excised product of approximately 60 kDa in size, Deep Vent
IVPS1 (I = I-PspI), is observed with the increase in temperature. Also, the
gel shows that bands of the same size as cleavage products, MBP-Deep
Vent IVPS1 (MI), approximately 103 kDa, and Deep Vent IVPS1-CBD (IC),
approximately 70 kDa, are present.
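
The sizes quoted for the MIC products can be compared with what the individual domains predict. The following sketch is illustrative only. It uses the MBP mass of 43 kD given in Example 10 and the 60 kDa size of the excised IVPS, infers the CBD mass from the ligated product, and, as a separate rough estimate, converts the 270 bp cassette length to a mass using an average of 0.110 kDa per residue. That average is a common rule of thumb and an assumption of this sketch, and the cassette length includes primer-derived bases, so the length-based figure is an upper estimate.

```python
# Apparent sizes (kDa) reported for the MIC gel.
OBSERVED_KDA = {"MIC": 110, "MC": 53, "I": 60, "MI": 103, "IC": 70}

MBP_KDA = 43   # M, from Example 10
IVPS_KDA = 60  # I, the excised Deep Vent IVPS1

# CBD mass inferred from the ligated product (MC = M + C).
CBD_KDA = OBSERVED_KDA["MC"] - MBP_KDA

DOMAIN_KDA = {"M": MBP_KDA, "I": IVPS_KDA, "C": CBD_KDA}

# Rough estimate from the cassette length. The per-residue mass is an
# assumed rule-of-thumb value, not a figure from the text.
CASSETTE_BP = 270
AVERAGE_RESIDUE_KDA = 0.110
cbd_kda_from_length = CASSETTE_BP / 3 * AVERAGE_RESIDUE_KDA


def predicted_kda(species):
    """Sum domain masses for a species written as letters, e.g. 'IC'."""
    return sum(DOMAIN_KDA[domain] for domain in species)


deviation = {
    name: predicted_kda(name) - observed
    for name, observed in OBSERVED_KDA.items()
}

print("CBD inferred from MC:", CBD_KDA, "kDa")
print("CBD estimated from length:", round(cbd_kda_from_length, 1), "kDa")
print("predicted minus observed:", deviation)
```

The two CBD estimates agree at about 10 kDa, and the predicted sizes match the reported bands except for the precursor, where the sum of the domains is 3 kDa above the approximate value read from the gel.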



EXAMPLE 12 



TRANS-SPLICING

This Example demonstrates that in vitro splicing can occur in
trans between halves of a precursor protein. The position at which to
split MIP (Example 9 and Xu et al., Cell, 75:1371-1377 (1993), the
disclosure of which is hereby incorporated by reference herein) was
chosen immediately upstream of a methionine residue in the native
CIVPS3, although other sites might work equally well, including sites
which result in gaps or overlapping CIVPS sequences. In this example,
one of the MIP half proteins was insoluble and splicing in trans was
therefore performed in urea. Partial or full denaturation should not be
construed as a requirement in general, since other separation points
may result in solubility of both halves and since the insoluble half can be
rendered soluble for trans-splicing experiments under non-denaturing
conditions.

CONSTRUCTION OF MI'

A fusion of the malE gene (encoding MBP) with the first 249
amino acids of the CIVPS3 gene was synthesized by polymerase chain
reaction (PCR) from pMIP21 (Example 10 and Xu et al., supra (1993))
carrying a fusion between malE, CIVPS3 and D. immitis paramyosin
ΔSal genes using the forward primer 5'-GGAATTCCATATGAAAATCGAAGAAGGT-3'
(SEQ ID NO:58) (NdeI site underlined) and the reverse
primer 5'-CGGJaAT£C_CGTTATAGTGAGATAACGTCCC G-3* (SEQ ID 
NO:59) (BamHI site underlined). PCR reaction mixtures contained Vent
DNA polymerase buffer (New England Biolabs, Inc.; Beverly,
Massachusetts), 400 µM each dNTP, 0.84 µM primers, 5 µg/ml plasmid
DNA, and 20 U/ml Vent Exo+ DNA polymerase (New England Biolabs,
Inc.; Beverly, Massachusetts) in 50 µl. Amplification was carried out
using a Perkin-Elmer Cetus thermal cycler at 94°C for 30 seconds (s),
52°C for 30 s, and 72°C for 135 s for 15 cycles. Restriction enzyme
digests were performed as described by the manufacturer (New
England Biolabs, Inc.; Beverly, Massachusetts). Gel purified NdeI/BamHI
digested PCR products were ligated directly into gel purified
BamHI/NdeI digested pAII-17 T7 vector (Perler et al., Proc. Natl. Acad.
Sci. USA, 89:5577-5581 (1992), the disclosure of which is hereby
incorporated by reference herein) to create pMI/L249 (Sambrook,
Molecular Cloning: A Laboratory Manual, 2nd Edition (1989), Cold
Spring Harbor Laboratory Press, Cold Spring Harbor, NY, the
disclosure of which is hereby incorporated by reference herein). E. coli
ER2169 pLysS (BL21(DE3) X P1vir (ER1489) --> TetR McrB-) was
transformed with pMI/L249 to create NEB941. The protein produced by
NEB941 was called MI' for MBP (maltose binding protein)-CIVPS3 N-
terminal domain (IVPS) fusion.

CONSTRUCTION OF I'P 

Restriction enzyme digests were performed as described by the
manufacturer (New England Biolabs, Inc.; Beverly, Massachusetts). Gel
purified XbaI/Bpu1102I digested pET-21b fragments carrying the
polylinker site and the 6 histidine tag sequence (Novagen; Madison,
Wisconsin) were ligated directly into gel purified Bpu1102I/XbaI
digested pAII-17 T7 vector DNA (Sambrook, supra (1989)) to create the
pPHT (Polylinker-HisTag)-T7 vector used for expression of I'P.

A fusion of the last 288 amino acids of the CIVPS3 gene with the
D. immitis paramyosin ΔSal gene was synthesized by PCR from pMIP21
(Xu et al., supra (1993)) using the forward primer
5'-GGAATTCCATATGCCAGAGGAAGAACTG-3' (SEQ ID NO:60) (NdeI site underlined) and
the reverse primer 5'-ATAGTTTAGCGGCCGCTCACGACGTTGTAAAACG-3'
(SEQ ID NO:61) (NotI site underlined). PCR mixtures were as
described above, except in 100 µl. Amplification was carried out using a
Perkin-Elmer Cetus thermal cycler at 94°C for 30 s, 52°C for 30 s, and
72°C for 105 s for 10 cycles. Gel purified NdeI/NotI digested PCR
products were ligated directly into gel purified NotI/NdeI digested pPHT-
T7 vector to create pI/M250-PH (Sambrook, supra (1989)). E. coli ER2169
pLysS was transformed with pI/M250-PH to create strain NEB942. The
protein produced by NEB942 was called I'P for CIVPS3 C-terminal
domain (IVPS)-D. immitis Paramyosin ΔSal-HisTag fusion. The C-
terminal domain has no additional amino acids since it begins with a
methionine present in CIVPS.

MI' EXPRESSION AND PURIFICATION

NEB941 was grown at 30°C in LB medium plus 100 µg/ml of
ampicillin to an OD600 of approximately 0.5. The culture was induced at 30°C with
0.4 mM isopropyl β-D-thiogalactoside (IPTG) and immediately
transferred to a 22°C air shaker in a cold room overnight. The cells were
harvested at 4°C and stored at -20°C. Frozen cells from a 1 liter culture
were resuspended in 50 ml of amylose column buffer (0.01 M Tris-HCl
pH 8.5, 0.2 M NaCl, 1.0 mM Na2-EDTA) and broken by sonication. After
centrifugation at 9,000 g for 30 min, the crude supernatant was passed
through an amylose column (New England Biolabs, Inc.; Beverly,
Massachusetts, 5 ml of resin), and the column was washed with 50 ml of
the above buffer. Maltose, at a final concentration of 10 mM, was added
to the column buffer and the elution continued until the MBP fusion was
eluted.

I'P EXPRESSION AND PURIFICATION

A HisTag was included in the construction of I'P to facilitate
purification. When I'P protein was expressed in E. coli, approximately
90% was insoluble, which is common with many HisTag (6-10
histidines) fusion proteins. Therefore, I'P samples were solubilized in 6 M
urea for purification and chromatographed over a Ni2+ affinity resin.

NEB942 was grown at 30°C in LB medium plus 100 µg/ml
ampicillin to an OD600 of approximately 0.5. The culture was induced at 30°C with 0.4
mM IPTG overnight. The cells were harvested at 4°C and stored at
-20°C. Frozen cells from a 1 liter culture were thawed in 130 ml of
amylose column buffer (0.2 M NaCl, 0.01 M Tris-HCl pH 8.5, 1.0 mM
Na2-EDTA) and broken by sonication. After centrifugation at 20,000 g for
30 min, the pellet containing insoluble material, including the I'P protein,
was resuspended in 130 ml of column buffer and centrifuged as before.
The washed pellet was resuspended a second time in 130 ml of column
buffer and spun as before. The twice washed pellet was finally
resuspended in 130 ml of Ni2+ binding buffer (20 mM Tris-HCl pH 7.9,
500 mM NaCl, 16 mM imidazole) supplemented with 6 M urea. The
solubilized pellet was stirred overnight at 4°C, then centrifuged a last
time at 31,000 g for 1 hour. The supernatant was filtered through a 0.45
µm membrane (Millex, Millipore; Bedford, Massachusetts), passed
through a Ni2+ charged column (Novagen; Madison, Wisconsin, 2.5 ml
of resin), and the column was washed with 10 volumes of binding buffer.
Imidazole at a final concentration of 60 mM was added to the binding
buffer and elution of contaminant proteins was continued until they were
undetectable by Bradford assay (BioRad; Hercules, California). The I'P
fusion protein was eluted with 180 mM of imidazole in the binding buffer
and elution continued until the fusion had eluted completely as shown
by the above assay.
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Two complementary halves of MIP were constructed as described 
above. The product of the N-terminal half of MIP, containing all of MBP 
and the N-terminal domain of CIVPS3 (amino acids 1-249) was termed 
Mr and the product of the C-terminal half of MIP. containing the C- 
terminal domain of CIVPS3 (amino acids 250-537) and all of 
Paramyosin ASal was termed PP. Unfortunately, PP was insoluble, and 
needed to be solubilized and purified in 6 M urea. The denaturation and 
renaturation of enzymes with recovery of enzymatic activity has been 
reported in the literature (Burbaum and Schimmel, Biochemistry, 
30:319-324 (1991); Hattori, et al, J. Biol. Chem., 268:22414-22419 
(1993); Sancho and Fersht, J. Mol. Biol., 224:741-747 (1992), among 
others, the disclosures of which are hereby incorporated by reference 
herein). However, each protocol differs. The initial protocol chosen for 
this study involved mixing both halves of MIP in urea, incubating at 4°C, 
rapidly diluting the proteins and then allowing the diluted proteins to 
refold. This was followed by a standard in vitro splicing protocol (Xu et 
al., EMBO J., 13:5517-5522 (1994); Xu et al., supra (1993), the 
disclosures of which are hereby incorporated by reference herein) after 
concentration of the diluted proteins, although this concentration step is 
not necessary. Variation of the different parameters including initial 
concentration, urea concentration (or other denaturants), dilution factor, 
length of incubation and protein ratio, allows the optimization of 
refolding and trans -splicing efficiencies. 

Purified MI' and I'P fusion proteins were exchanged into buffer A 
(50 mM Tris-HCl pH 7.5, 5% acetic acid, 0.1 mM EDTA, 1 mM DTT and 
140 mM beta-mercaptoethanol) supplemented with 7.2 M urea and 
equilibrated at pH 7.5 prior to use. Macrosep (15 ml) and Microsep (3.5 
ml) concentrator devices (Filtron Technology Corp.; Northborough, 
Massachusetts) were used in every step that required a buffer exchange 
or a protein concentration as described by the manufacturer. The two 
fusions were then mixed together at a final concentration of 2.3 mg/ml 
and incubated overnight at 4°C. The mixture was diluted 50-fold in buffer 
B (Tris-HCl pH 6, 500 mM NaCl) and renaturation was allowed to occur 
during the 2 hour concentration step to 0.5-2 mg/ml at 4°C. The mixture 
was heated in a Perkin-Elmer Cetus thermal cycler at 42°C for 1 hour to 
induce splicing. To follow the splicing reaction, samples were collected 
at time-points and Western blots (Sambrook, supra (1989)) were 
performed in duplicate with mouse sera raised against either CIVPS3 
(anti-PI-PspI) or paramyosin ΔSal (Steel et al., J. Immunology, 
145:3917-3923 (1990), the disclosure of which is hereby incorporated 
by reference herein). In later experiments, the concentration step after 
mixing was found to be unnecessary. Unfortunately, Western blots are 
necessary to follow splicing because both the substrate, MI', and the 
product, MP, have similar molecular masses (approximately 72 kDa). 
The anti-paramyosin antibody is diagnostic, since it shows the decay of 
the I'P substrate (approximately 60 kDa) and the formation of the MP 
product (~72 kDa). On the other hand, anti-MBP sera (New England 
Biolabs, Inc.; Beverly, Massachusetts), which react with the similarly 
sized MI' and MP, are not diagnostic since as MI' decreases, MP 
increases at the same position in the gel. As a result, the anti-MBP sera 
detect a relatively constant band at 72 kDa. The Western blot with the 
mouse anti-CIVPS3 sera demonstrates the decay of the substrates (MI' 
and I'P) and the formation of the I' products (which are often inseparable 
during electrophoresis because of their similar molecular masses). 
Western blots using anti-Paramyosin antibodies show that there is no 
cross-reactivity, since anti-Paramyosin sera fail to react with MI' (Figure 
17B). Anti-CIVPS3 (anti-PI-PspI) antibody was shown to react with both 
MI' and I'P (Figure 17A). 
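
The similar gel positions of the N-terminal fusion and of MP described above can be checked with a rough mass estimate. The following is a minimal sketch for illustration only. It assumes an average residue mass of 110 Da (a common approximation that is not given in this Example), takes the 43 kDa value for MBP from Example 14 and the 72 kDa value for MP from the text, and infers the paramyosin portion by difference. All variable names are hypothetical.

```python
# Rough mass bookkeeping for the trans-splicing species (illustrative only).
AVG_RESIDUE_KDA = 0.110  # ASSUMPTION: average mass per amino acid residue

mbp_kda = 43.0                      # M, value stated for MBP in Example 14
mp_kda = 72.0                       # spliced product MP, value stated in the text
n_domain_residues = 249             # CIVPS3 amino acids 1-249
c_domain_residues = 537 - 250 + 1   # CIVPS3 amino acids 250-537

n_domain_kda = n_domain_residues * AVG_RESIDUE_KDA
c_domain_kda = c_domain_residues * AVG_RESIDUE_KDA
p_kda = mp_kda - mbp_kda            # paramyosin portion inferred by difference

n_fusion_kda = mbp_kda + n_domain_kda   # MBP plus N-terminal CIVPS3 domain
c_fusion_kda = c_domain_kda + p_kda     # C-terminal CIVPS3 domain plus paramyosin

print(f"N-terminal fusion ~ {n_fusion_kda:.1f} kDa")
print(f"C-terminal fusion ~ {c_fusion_kda:.1f} kDa")
print(f"MP                ~ {mp_kda:.1f} kDa")
```

Under these assumptions the N-terminal fusion falls within about 2 kDa of MP, which is consistent with the need for anti-Paramyosin or anti-CIVPS3 Western blots to tell substrate from product.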

Protein splicing of MIP, in cis, is more efficient at high 
temperatures (up to 65°C) and low pH (6.0) (Xu et al., supra (1994); Xu et 
al., supra (1993) and Example 11). After a few assays, the splicing 
reaction for trans-splicing was set at 42°C, pH 6.0, although other 
temperatures and pH's also work. A time course of trans-splicing is 
shown in Figures 17A and 17B. The trans-splicing reaction is best 
monitored by the accumulation of the 72 kDa MP as shown on Western 
blots using anti-Paramyosin sera, and by the decrease in MI' and I'P and 
the formation of I' using anti-CIVPS3 sera (anti-PI-PspI, Figure 17A). In 
this experiment, both MI' and I'P were exchanged into 7.2 M urea in 
buffer A using a Microsep concentrator (Filtron Technology Corp.; 
Northborough, Massachusetts) and mixed at a final concentration of 
1 mg/ml each protein. The mixtures were incubated overnight at 4°C and 
then diluted 50-fold into buffer B. Diluted samples were immediately 
placed at 42°C or 4°C. Samples were taken after 5, 10, 20, and 30 
minutes of incubation and placed on ice. A zero time point was taken 
prior to placing the tube at 42°C. 5 µl of each time point was 
electrophoresed in duplicate 5-20% SDS-PAGE gels (Daiichi, Tokyo, 
Japan) and Western blots were performed (Perler et al., supra (1992); 
Sambrook, supra (1989)) with either anti-CIVPS3 (anti-PI-PspI) or anti- 
Paramyosin sera. No trans-splicing was observed in the samples 
incubated at 4°C. Within 5 minutes at 42°C, the branched intermediate 
(MI'P*) was observed and by 10 minutes, spliced products (MP and both 
I') were observed (Figures 17A and 17B). 
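
The effect of the 50-fold dilution step on the denaturant and protein concentrations in the time course above can be worked out as follows. This is an illustrative sketch only. It assumes that buffer B itself contains no urea, and the function and variable names are hypothetical.

```python
def dilute(concentration, fold):
    """Return the concentration remaining after a fold-dilution."""
    return concentration / fold

# Values taken from the time course experiment described in the text.
urea_after_m = dilute(7.2, 50)          # mol/l urea after dilution into buffer B
protein_after_mg_ml = dilute(1.0, 50)   # mg/ml of each fusion after dilution

print(f"urea after dilution: {urea_after_m:.3f} M")
print(f"each fusion after dilution: {protein_after_mg_ml:.3f} mg/ml")
```

On this reckoning the proteins refold at roughly 0.14 M residual urea, which may explain why a separate denaturant-removal step is not described.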

RE-ESTABLISHMENT OF I-PspI ACTIVITY 

After trans-splicing, the protein mixture was tested for I-PspI 
activity (I-PspI or PI-PspI is the same as CIVPS3 or I in this Example). 
The substrate DNA used for I-PspI digestion is pAKR7, which was 
generated by subcloning a 714 bp EcoRI fragment from pAKK4 (Perler 
et al., supra (1992)) into the EcoRI site of Bluescript SK-. This 714 bp 
fragment contains the coding region surrounding the sites where IVPS1 
and IVPS2 were found in the wild type Vent DNA polymerase clone. 
Cleavage with XmnI and I-PspI should give fragments of [...] and 
1351 bp. Test substrate DNA, pAKR7, was reacted with either MI', I'P, 
the trans-splicing reaction products or cis-spliced MIP52 (Figure 18). 
pAKR7 was cut with XmnI to linearize the plasmid at a point near the I- 
PspI restriction site. 5 µg of pAKR7 DNA was digested with XmnI 
(New England Biolabs, Inc.; Beverly, Massachusetts) in NEB buffer 2 for 
100 min at 37°C. One microgram of linearized pAKR7 DNA was mixed 
with 0.01, 0.1 or 1 µg of either MI', I'P or the trans-splicing reaction 
products in a final volume of 55 µl I-PspI buffer (New England Biolabs, 
Inc.; Beverly, Massachusetts) and incubated at 50°C for 1 hour. In an 
identical reaction, MIP52 protein was used as a control. MIP52 is a 
mutant form of MIP containing an insert of MILVA prior to Ser1 of 
CIVPS3 and this insertion has no effect on splicing or endonuclease 
activity. MIP52 was used rather than I-PspI because the cis-spliced 
mixture more closely mimicked the trans-spliced mixture. The MIP52 
control sample contained precursor MIP and cis-spliced MP and I 
products. Endonuclease activity was only present in the MIP52 enzyme 
and the trans-spliced mixture, indicating that the above trans-splicing 
protocol not only re-establishes the ability to splice, but also re- 
establishes endonuclease activity in CIVPS3. As another control, MI' 
and I'P were added separately to a digestion mixture as above; no 
digestion was observed (Figure 18), indicating that both protein 
fragments are required to restore endonuclease activity. 

EXAMPLE 13 

TRANS-CLEAVAGE 

In this Example, we describe cleavage at the C-terminus of 
CIVPS3 in trans using the MIP fragments described in Example 12 as a 
starting point. 

CONSTRUCTION OF MI'22 CONTAINING AN ILE2LYS 
MUTATION IN CIVPS3 

In Example 12, we described the construction of MI' and I'P, 
which were used for trans-splicing. In this Example we replaced the 
splice junction cassette in pMI/L249 (which encodes MI') with a duplex 
oligomer which replaces Ile2 of CIVPS3 with Lys. The techniques used 
are as described in Examples 10 and 12. Briefly, prior to ligation with the 
vector, pMI/L249, two complementary oligomers, DVMIP525FW 
(5'-TCGAGGCTAGCAAATTACCGGAAGAATGGGTAC-3' (SEQ ID NO:62)) and 
DVMIP526RV (5'-CCATTCTTCCGGTAATTTGCTAGCC-3' (SEQ ID 
NO:63)) were allowed to anneal to form a double-stranded linker, 
DVMIP525FW/DVMIP526RV. 100 pmol of each oligomer was incubated 
in 50 µl of 1x T4 DNA ligase buffer (New England Biolabs, Inc.; Beverly, 
Massachusetts) at 68°C for 15 min and slowly cooled to 20-30°C. 
pMI/L249 DNA was digested with XhoI-KpnI (New England Biolabs, Inc.; 
Beverly, Massachusetts) as described by the manufacturer and the 
linear plasmid was purified after electrophoretic separation using the 
Geneclean II kit (BIO101; Vista, California). 0.1 µg of XhoI-KpnI digested 
pMI/L249 DNA was ligated overnight at 16°C in 10 µl of 1x T4 DNA ligase 
buffer (New England Biolabs, Inc.; Beverly, Massachusetts) with 80 units 
of T4 DNA ligase (New England Biolabs, Inc.; Beverly, Massachusetts) and 
15.5 pmol of the linker DVMIP525FW/DVMIP526RV. The resulting 
construct was termed pMI'22 and the protein produced by this clone was 
called MI'22. 
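
The substitution carried by the forward oligomer can be read off by translating it in frame. The sketch below is illustrative only. It assumes the forward sequence DVMIP525FW as printed above with its line break rejoined, and it assumes a reading frame starting at the sixth nucleotide, chosen so that the Ser codon precedes the substituted position. The partial codon table and all names are hypothetical helpers, not part of the original disclosure.

```python
# Partial codon table covering only the codons present in this linker.
CODONS = {
    "GCT": "Ala", "AGC": "Ser", "AAA": "Lys", "ATT": "Ile",
    "TTA": "Leu", "CCG": "Pro", "GAA": "Glu", "TGG": "Trp",
}

def translate(seq, start, n_codons):
    """Translate n_codons codons of seq beginning at 0-based index start."""
    return [CODONS[seq[i:i + 3]] for i in range(start, start + 3 * n_codons, 3)]

# DVMIP525FW as given in the text (SEQ ID NO:62).
forward = "TCGAGGCTAGCAAATTACCGGAAGAATGGGTAC"

# ASSUMPTION: the coding frame starts at index 5, after the XhoI-compatible end.
residues = translate(forward, 5, 8)
print("-".join(residues))
```

In this frame the residue following Ser is Lys (codon AAA), in agreement with the Ile2Lys replacement described in the text.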

PURIFICATION OF MI'22 AND I'P 

I'P was purified as described in Example 12. MI'22 was purified 
as described for MI' in Example 12. E. coli strain ER2497 (NEB975) was 
transformed with pMI'22 and grown at 30°C in LB medium plus 100 
µg/ml of ampicillin to an OD600 of ~0.5. The [...] 
broken by sonication. After centrifugation at 9,000 g for 20 min, the 
crude supernatant was diluted two-fold in column buffer and passed 
through an amylose column (New England Biolabs, Inc.; Beverly, 
Massachusetts, 12.5 ml of resin), and the column was washed with 60 
ml of the above buffer followed by 60 ml of amylose [...]. 
Maltose, at a final concentration of 10 mM, was added 
to the column buffer and the elution continued until the MBP fusion 
was eluted. 

TRANS-CLEAVAGE 

Two complementary halves of MIP were constructed as described 
above. The product of the N-terminal half of MIP, containing all of MBP 
and the N-terminal domain of CIVPS3 including the Ile2Lys substitution 
(amino acids 1-249), was termed MI'22 and the product of the C-terminal 
half of MIP, containing the C-terminal domain of CIVPS3 (amino acids 
250-537) and all of Paramyosin ΔSal, was termed I'P. The products of 
the trans-splicing reaction are the unchanged MI'22 and the cleaved I'P, 
which forms the I' fragment and the P fragment, both of which are 
approximately 30 kDa. Unfortunately, I'P was insoluble, and needed to 
be solubilized and purified in 6 M urea. The initial protocol chosen for 
this study was as described in Example 12 and involved mixing both halves 
of MIP in urea, incubating at 4°C, rapidly diluting the proteins and then 
allowing the diluted proteins to refold. This was followed by a standard 
in vitro splicing protocol (Xu, M., et al., supra (1994); Xu, M., et al., supra 
(1993)). Variation of the different parameters, including initial 
concentration, urea concentration (or other denaturants), dilution factor, 
length of incubation and protein ratio, allows the optimization of 
refolding and trans-cleaving efficiencies. 

Approximately 10 ng each of purified MI'22 and I'P fusion protein 
were mixed in 24 µl total volume of Novagen His Tag column binding 
buffer (20 mM Tris-HCl, pH 7.9, 0.5 M NaCl, 5 mM imidazole) adjusted 
to 6 M urea and incubated on ice for 90 minutes. The sample was diluted 
25-fold in 20 mM sodium phosphate buffer, pH 6, 0.5 M NaCl and 1 mM 
EDTA and incubated at 42°C. Samples were taken and placed on ice at 
0, 5, 10, 20, 40 and 90 minutes. Samples were boiled in SDS-PAGE 
sample buffer (Sambrook et al., supra (1989)), electrophoresed on 4- 
20% gradient SDS-PAGE (Daiichi, Tokyo, Japan) and stained with 
Coomassie blue. As seen in Figure 19, I'P (~60 kDa) disappears with 
time and I' and P appear at approximately the same position in the gel 
(~30 kDa). Control samples which are not shown include incubating the 
mixture of MI'22 plus I'P at 4°C or incubating each protein fragment 
separately at 42°C; none of these control experiments showed any 
cleavage activity. 

EXAMPLE 14 

CHEMICAL CONTROL OF IVPS ACTIVITY 

In previous Examples, we have demonstrated that splicing and 
cleavage activities of IVPSs can be controlled by amino acid 
substitution, temperature and pH. In this Example, we demonstrate that 
chemical treatment may also be used to activate or inhibit IVPS activity. 
Thus, an IVPS can become a CIVPS when its activity can be controlled 
by chemical treatment. In Example 10, we described modification of 
CIVPS3 in the MIP fusion by cassette replacement which resulted in 
cleavage at one of the splice junctions instead of splicing. pMIP21 
contains two unique restriction sites at each splice junction: an XhoI site 
and a KpnI site flanking the N-terminal splice junction and a BamHI site 
and a StuI site flanking the C-terminal splice junction (Figure 14, 
Example 10). The N-terminal splice junction residue(s) can be changed 
by replacing the XhoI-KpnI cassette, while the C-terminal splice junction 
residue(s) can be altered by substituting the BamHI-StuI cassette. In the 
case of the N-terminal cassette replacement, pMIP21 is first digested 
with XhoI and KpnI. A cassette carrying desired mutations, formed by 
annealing two complementary primers, is substituted for the original 
XhoI-KpnI cassette. Some modifications in the CIVPS may allow 
activation of cleavage or splicing activity by chemical treatment. In this 
specific example, we show that substitution of Ser1 by Cys in CIVPS3 
results in a chemical-inducible CIVPS in the MI (a truncated form of MIP) 
context, which, upon chemical activation with hydroxylamine, results in 
cleavage of the bond between MBP and cysteine in the modified 
CIVPS3. 

MODIFICATION OF CIVPS3 BY REPLACING SER1 WITH CYS 

In this Example, we first modified pMIP21 (Example 10) by 
substituting a serine with a cysteine at the N-terminal splice junction of 
CIVPS3 (Ser1Cys) by cassette replacement to yield pMIP47. 2 µg of 
pMIP21 was digested at 37°C for 4 hours in 100 µl of 1x Buffer 1 (New 
England Biolabs, Inc.; Beverly, Massachusetts), 100 µg/ml BSA and 20 
units of XhoI and 20 units of KpnI (New England Biolabs, Inc.; Beverly, 
Massachusetts). Following electrophoretic separation on a 1% agarose 
gel, pMIP21 DNA was purified by using the Geneclean II kit (BIO101; 
Vista, California). Two complementary oligomers, MIP535FW (5'- 
TCGAGGCTTGCATTTTACCGGAAGAATGGGTAC-3' (SEQ ID NO:64)) 
and MIP536RV (5'-CCATTCTTCCGGTAAAATGCAAGCC-3' (SEQ ID 
NO:65)) were allowed to anneal to form a double-stranded linker, 
MIP535FW/MIP536RV. 100 pmol of each of oligomers MIP535FW and 
MIP536RV were incubated in 50 µl of 1x T4 DNA ligase buffer (New 
England Biolabs, Inc.; Beverly, Massachusetts) at 65°C for 15 min and 
slowly cooled to 20-30°C. Approximately 0.1 µg of the XhoI-KpnI 
digested pMIP21 DNA was ligated at 16°C overnight in 10 µl of 1x T4 
DNA ligase buffer (New England Biolabs, Inc.; Beverly, Massachusetts), 
80 units of T4 DNA ligase (New England Biolabs, Inc.; Beverly, 
Massachusetts) and 15.6 pmol of the linker MIP535FW/MIP536RV, to 
yield pMIP47. The ligated DNA sample was used to transform E. coli 
strain ER2426 (NEB974). 
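
The design of such a replacement cassette can be checked by confirming that the two oligomers are complementary and by inspecting the single-stranded ends left after annealing. The following is a minimal sketch for illustration only. It assumes that the garbled character in the printed MIP535FW sequence is a T, as required for complementarity with MIP536RV. The function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def linker_overhangs(forward, reverse):
    """Return the unpaired 5' and 3' ends of the forward strand after annealing."""
    paired = reverse_complement(reverse)
    start = forward.find(paired)
    if start < 0:
        raise ValueError("oligomers are not complementary")
    return forward[:start], forward[start + len(paired):]

# Sequences from the text (SEQ ID NO:64 and NO:65); the forward strand is
# written with the ASSUMED reading of the garbled character as T.
mip535fw = "TCGAGGCTTGCATTTTACCGGAAGAATGGGTAC"
mip536rv = "CCATTCTTCCGGTAAAATGCAAGCC"

left_end, right_end = linker_overhangs(mip535fw, mip536rv)
print(left_end, right_end)
```

The unpaired ends found in this way, TCGA on the 5' side and GTAC on the 3' side, correspond to the cohesive ends produced by XhoI and KpnI respectively, so the annealed linker can replace the XhoI-KpnI cassette directly.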

CONSTRUCTION OF pMI84 ENCODING MI 

pMI84 was constructed in two steps by the following cassette 
replacement experiments. pMIP21 was first modified by replacing the C- 
terminal splice junction cassette with linker MIP353FW/MIP354RV to 
yield pMIP66. The linker MIP353FW/MIP354RV, containing a SphI 
recognition sequence, was formed by annealing two complementary 
oligomers, MIP353FW (5'-GATCCCTCTATAAGCATAATATTGGCATG 
CAGTA-3' (SEQ ID NO:66)) and MIP354RV (5'-TACTGCATGCCAATATT 
ATGCTTATAGAGG-3' (SEQ ID NO:67)) as described above. pMIP21 
DNA was digested with BamHI (New England Biolabs, Inc.; Beverly, 
Massachusetts) and StuI (New England Biolabs, Inc.; Beverly, 
Massachusetts) as described in Example 10. 0.1 µg BamHI/StuI 
digested pMIP21 DNA was ligated at 16°C overnight with 16.6 pmol of 
linker MIP353FW/MIP354RV in 10 µl of 1x T4 DNA ligase buffer (New 
England Biolabs, Inc.; Beverly, Massachusetts) in the presence of 40 
units of T4 DNA ligase (New England Biolabs, Inc.; Beverly, 
Massachusetts). After addition of 1 µl of 10x buffer 2 (New England 
Biolabs, Inc.; Beverly, Massachusetts) and 0.5 µl (10 units) of StuI (New 
England Biolabs, Inc.; Beverly, Massachusetts), the ligated DNA sample 
was incubated at 37°C for 3 hours before E. coli ER2426 (NEB974) was 
transformed. 

pMIP66 contains unique BamHI and SphI sites flanking the C- 
terminal splice junction, allowing linker replacement following BamHI 
and SphI digestion. A stop codon was then inserted after the CIVPS C- 
terminus to create the MI truncated fusion. Ser538 was mutated to a 
translational stop codon (TAA) by replacing the BamHI-SphI cassette 
with the linker MIP385FW/MIP386RV. The linker was formed as 
described above by annealing two complementary oligomers, 
MIP385FW (5'-GATCCCTCTATGCACATAATTAAGGCATG-3' (SEQ ID 
NO:68)) and MIP386RV (5'-CCTTAATTATGTGCATAGAGG-3' (SEQ ID 
NO:69)). This mutagenesis cassette contains compatible termini to 
replace the C-terminal splice junction cassette following BamHI-SphI 
cleavage of pMIP66. Approximately 1 µg of pMIP66 was digested at 
37°C for 4 hours in 30 µl of 1x BamHI Buffer (New England Biolabs, Inc.; 
Beverly, Massachusetts), 20 units of BamHI and 20 units of SphI (New 
England Biolabs, Inc.; Beverly, Massachusetts). Following 
electrophoretic separation on 1% agarose gel, pMIP66 DNA was 
purified by the Geneclean II kit (BIO101; Vista, California) in 20 µl of 10 
mM Tris-HCl (pH 8.0)/0.1 mM EDTA. Approximately 0.05 µg of the 
BamHI-SphI digested pMIP66 DNA was ligated at 16°C overnight in 10 
µl of 1x T4 DNA ligase buffer (New England Biolabs, Inc.; Beverly, 
Massachusetts), 80 units of T4 DNA ligase (New England Biolabs, Inc.; 
Beverly, Massachusetts) and 16.6 pmol of the linker 
MIP385FW/MIP386RV, to yield pMI84. The ligated DNA samples were 
used to transform E. coli strain ER2426 (NEB974). 

CONSTRUCTION OF pMI94 (MI WITH THE SER1CYS 
MUTATION) 

The translational stop codon (TAA) introduced at the C-terminal 
splice junction in pMI84 was transferred into pMIP47 to yield pMI94 by 
ligation of a 6.6 Kb KpnI-PstI fragment of pMIP47 and a 2.3 Kb KpnI-PstI 
fragment of pMI84. 1 µg each of pMIP47 and pMI84 DNA was incubated 
at 37°C for 4 hours in 30 µl of 1x Buffer 1 (New England Biolabs, Inc.; 
Beverly, Massachusetts), 10 units of KpnI (New England Biolabs, Inc.; 
Beverly, Massachusetts) and 10 units of PstI (New England Biolabs, Inc.; 
Beverly, Massachusetts). Following electrophoretic separation on 1% 
agarose gel, the 6.6 Kb KpnI-PstI fragment from the pMIP47 sample and 
the 2.3 Kb KpnI-PstI fragment from the pMI84 sample were purified by 
the Geneclean II kit (BIO101), each in 20 µl of 10 mM Tris-HCl (pH 
8.0)/0.1 mM EDTA. pMI94 was formed by incubation at 16°C overnight of 
1 µl of the purified 6.6 Kb pMIP47 DNA and 7.8 µl of the purified 2.3 Kb 
pMI84 DNA, 1 µl of 10x T4 DNA ligase buffer (New England Biolabs, 
Inc.; Beverly, Massachusetts) and 0.2 µl of 400,000 units/ml of T4 DNA 
ligase (New England Biolabs, Inc.; Beverly, Massachusetts). The ligated 
DNA sample was used to transform E. coli strain ER2426 (NEB974). 
pMI94 encodes the MI fusion protein with the Ser1Cys substitution 
which is present in pMIP47 in the full MIP fusion context. 
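
The component volumes of the pMI94 ligation can be tallied to confirm the reaction volume, the buffer strength and the amount of ligase. The sketch below is illustrative only. It assumes that the volume of the 6.6 Kb fragment is 1 microliter, and the dictionary keys and variable names are hypothetical labels.

```python
# Component volumes in microliters, as read from the ligation described in the text.
volumes_ul = {
    "pMIP47 6.6 Kb fragment": 1.0,   # ASSUMPTION: read as 1 microliter
    "pMI84 2.3 Kb fragment": 7.8,
    "10x T4 DNA ligase buffer": 1.0,
    "T4 DNA ligase": 0.2,
}

total_ul = sum(volumes_ul.values())

# Final buffer strength: a 10x stock diluted into the total volume.
buffer_strength_x = 10 * volumes_ul["10x T4 DNA ligase buffer"] / total_ul

# Ligase stock is 400,000 units/ml, which is 400 units per microliter.
ligase_units = volumes_ul["T4 DNA ligase"] * 400_000 / 1000

print(f"total volume: {total_ul:.1f} ul")
print(f"buffer strength: {buffer_strength_x:.1f}x")
print(f"ligase: {ligase_units:.0f} units")
```

On this reading the reaction is 10 microliters of 1x buffer containing 80 units of ligase, matching the conditions used for the other ligations in this Example.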

PURIFICATION OF MI94 FOLLOWED BY CHEMICAL 
INDUCIBLE CLEAVAGE ACTIVITY 

The pMI94 construct expresses the MBP-CIVPS3 fusion protein, 
termed MI94, containing a cysteine residue instead of the native serine 
residue at the N-terminus of CIVPS3. In order to conduct in vitro study of 
cleavage activity, expression of the MI94 fusion protein was induced at 
low temperature (12°C) and purified by amylose resin columns. ER2426 
(NEB974) harboring pMI94 was cultured in 2 liters of LB medium 
supplemented with 100 µg/ml ampicillin and induced as described in 
Example 10. Cells were pelleted, sonicated in 100 ml of pH 8.5 column 
buffer (20 mM NaPO4, pH 8.5, 500 mM NaCl) and spun down. The 
cleared lysate was loaded over a 15 ml amylose resin column (New 
England Biolabs, Inc.; Beverly, Massachusetts). The column was 
washed with 100 ml of pH 8.5 column buffer and subsequently with 100 
ml of pH 6 column buffer (20 mM NaPO4, pH 6.0, 500 mM NaCl). MI94 was 
eluted with 10 mM maltose in pH 6 column buffer (as in the procedure 
described in Example 9). 

Hydroxylamine (NH2OH) was used to activate cleavage activity at 
the N-terminal splice junction. The MI94 protein sample (0.6 [...]) was 
treated with 0.25 M NH2OH at pH 6 and pH 7. 75 µl of the purified MI94 
sample was mixed with 25 µl of 0.4 M Bis-Tris-Propane, 0.5 M NaCl and 
1 M NH2OH-HCl (Sigma), adjusted to pH 6 with 6 N HCl, or with 25 µl of 
0.4 M Bis-Tris-Propane, 0.5 M NaCl and 1 M NH2OH-HCl (Sigma), 
adjusted to pH 7 with 6 N NaOH. In a control experiment, 100 µl of the 
MI94 sample was mixed with 33 µl of 0.4 M Bis-Tris-Propane, 0.5 M 
NaCl adjusted to pH 6 with 6 N HCl. 40 µl of the control sample was 
mixed with 20 µl of 3X Protein Sample Buffer (New England Biolabs, 
Inc.; Beverly, Massachusetts) and stored on ice. Two 40 µl aliquots of 
each mixture were incubated at 37°C for 0.5 and 2 hours, respectively. 
Each sample was mixed with 20 µl of 3X Protein Sample Buffer (New 
England Biolabs, Inc.; Beverly, Massachusetts) and boiled for 5 min. 5 µl 
of each sample was electrophoresed on a 4-12% SDS-polyacrylamide 
gel followed by Coomassie Blue staining (Figure 20). The data 
indicate that in comparison with the control experiment (minus NH2OH), 
hydroxylamine treatment drastically increased cleavage activity at the N- 
terminal splice junction. At both pH 6 and pH 7, MI fusion protein was 
activated by hydroxylamine and efficiently cleaved, yielding more MBP 
(M, 43 kDa) and CIVPS3 (I, 60 kDa). 
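
The final reagent concentrations in the activation and control mixtures follow from the mixing volumes. The sketch below is illustrative only. It reads the garbled volumes in this paragraph as 75 and 25 microliters for the treated samples and 100 and 33 microliters for the control, which is an assumption, and the function name is hypothetical.

```python
def final_concentration(stock_molar, stock_ul, sample_ul):
    """Concentration of a stock component after mixing with a protein sample."""
    return stock_molar * stock_ul / (stock_ul + sample_ul)

# Treated samples: 25 ul of activation stock added to 75 ul of protein sample.
nh2oh_final_m = final_concentration(1.0, 25, 75)   # hydroxylamine
btp_final_m = final_concentration(0.4, 25, 75)     # Bis-Tris-Propane

# Control: 33 ul of stock without hydroxylamine added to 100 ul of sample.
btp_control_m = final_concentration(0.4, 33, 100)

print(f"NH2OH in treated samples: {nh2oh_final_m:.2f} M")
print(f"Bis-Tris-Propane, treated: {btp_final_m:.3f} M")
print(f"Bis-Tris-Propane, control: {btp_control_m:.3f} M")
```

With these volumes the hydroxylamine concentration comes to 0.25 M, the value stated in the text, and the buffer concentration in the control is nearly the same as in the treated samples, so the comparison isolates the effect of hydroxylamine.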

In this Example we demonstrate that modifications of an IVPS 
may result in drastic effects on splicing and cleavage activity after 
chemical treatment. Furthermore, this data gives another example of 
constructs where cleavage at the N-terminal splice junction is observed in 
the absence of ligation and carboxyl junction cleavage activities of the 
CIVPS. 

EXAMPLE 15 

CHEMICAL CONTROL OF CLEAVAGE ACTIVITY OF IVPS 
FROM SACCHAROMYCES CEREVISIAE 

Protein splicing activity of IVPS (yeast intein) from 
Saccharomyces cerevisiae has been described by Hirata et al., supra 
and Kane et al., supra. In this Example, we describe the construction of 
a yeast intein fusion system similar to the MIP fusion of Example 9. The 
yeast intein fusion system is a 3-part fusion composed of a maltose 
binding protein (MBP), a genetically engineered yeast intein (Y), and a 
chitin binding domain (B). This yeast intein fusion system, named MYB 
fusion, can be induced to cleave at the N-terminal splicing junction (Cys1) 
between the maltose binding protein and the yeast intein. MBP can be 
replaced by the target protein in the MYB protein purification system. 

CONSTRUCTION OF WILD-TYPE MYP 

Splice junction amino acid residues of the yeast IVPS are shown 
in Figure 1. Yeast IVPS (Gimble et al., J. Biol. Chem., 268(29):21844- 
21853 (1993), the disclosure of which is hereby incorporated by 
reference herein) was amplified by PCR from the plasmid pT7VDE 
and inserted into MIP21 (described in Example 10) between the XhoI 
site and the StuI site to replace the CIVPS3 (or the Pyrococcus IVPS). 
Primer pairs 5'-GCGCTCGAGGGGTGCTTTGCCAAGGGTACCAAT-3' 
(SEQ ID NO:70) and 5'-CCTCCGCAATTATGGACGACAACCTGGT-3' 
(SEQ ID NO:71) were used to synthesize the IVPS fragment by PCR. 
pT7VDE plasmid DNA, containing the yeast IVPS gene sequence in the 
orientation of the T7 promoter, was used as template. The PCR mixture 
contained Vent DNA polymerase buffer (New England Biolabs, Inc.; 
Beverly, Massachusetts), supplemented with 4 mM magnesium sulfate, 
400 µM of each dNTP, 1 µM of each primer, 50 ng pT7VDE DNA and 0.5 
units of Vent DNA polymerase in 50 µl. Amplification was carried out by 
using a Perkin-Elmer/Cetus (Emeryville, California) thermal cycler at 
94°C for 30 sec, 50°C for 30 sec and 72°C for 5 min for 20 cycles. The 
samples were electrophoresed on a 1% agarose gel and 
approximately 2 µg of PCR-synthesized 1.3 Kb fragment were recovered 
in 20 µl of distilled water by Geneclean II kit (BIO101; Vista, California). 
The purified DNA was subjected to digestion in 100 µl of 1x NEB buffer 
2 with 40 units of XhoI (New England Biolabs, Inc.; Beverly, 
Massachusetts). The digested DNA was extracted with phenol and 
chloroform and precipitated in 0.3 M NaAcetate pH 5.2 and 70% ethanol 
at -20°C overnight. DNA was spun down, dried and resuspended in 40 
µl distilled water. 0.5 µg of MIP21 DNA was digested by XhoI and StuI 
and the 7.2 Kb vector DNA was purified from 1% agarose gel by 
Geneclean II (BIO101; Vista, California) at 0.5 µg/20 µl. 
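
The absolute reagent amounts and the programmed cycling time implied by the PCR conditions above can be derived directly. The sketch below is illustrative only. It ignores temperature ramp times, which depend on the instrument, and the variable names are hypothetical.

```python
reaction_ul = 50

# Micromolar times microliters gives picomoles.
primer_pmol_each = 1 * reaction_ul             # 1 uM of each primer
dntp_nmol_each = 400 * reaction_ul / 1000      # 400 uM of each dNTP, in nmol

# One cycle: 94 C for 30 s, 50 C for 30 s, 72 C for 5 min.
cycle_seconds = 30 + 30 + 5 * 60
total_minutes = 20 * cycle_seconds / 60        # 20 cycles, hold times only

print(f"each primer: {primer_pmol_each} pmol")
print(f"each dNTP: {dntp_nmol_each:g} nmol")
print(f"programmed hold time: {total_minutes:g} min")
```

The long extension step dominates the program, which amounts to two hours of hold time over the 20 cycles.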

MYP1 was created by ligation of the XhoI-digested IVPS fragment to 
the 7.2 Kb XhoI-StuI MIP21 fragment. The reaction was carried out at 
22°C for 5 hours in 10 µl volume with addition of 2 µl of 10X T4 DNA 
ligase buffer (New England Biolabs, Inc.; Beverly, Massachusetts), 0.4 
µg IVPS DNA, 0.025 µg MIP21 fragment, and 200 units of T4 DNA ligase 
(New England Biolabs, Inc.; Beverly, Massachusetts). Transformation of 
E. coli strain RR1 with the ligation samples was performed as described 
in Example 2. Transformants were cultured in LB medium, supplemented 
with 100 µg/ml ampicillin, for extraction of plasmid DNA using Qiagen 
spin columns (Qiagen, Inc.; Studio City, California). The clones were 
further examined for their ability to splice to form MP species (71 kDa). 
Nine clones carrying MYP1-9 were cultured in LB medium 
supplemented with 100 µg/ml ampicillin, at 30°C until the OD600 
reached about 0.5. Expression of the MYP fusion gene was induced by 
addition of IPTG to a final concentration of 1 mM at 30°C for 3 additional 
hours. Cells were spun down and resuspended in 0.5 ml LB medium. 
Crude extracts were prepared as described in Example 3. Western blots 
using antibodies raised against MBP (New England Biolabs, Inc.; 
Beverly, Massachusetts) were performed to detect fusion protein and 
splicing products expressed from these clones. Samples were 
electrophoresed on 4-12% Tris-Glycine gels (Novex; Encinitas, 
California) with prestained markers (Gibco BRL; Gaithersburg, 
Maryland), transferred to nitrocellulose, probed with anti-MBP antibody 
(New England Biolabs, Inc.; Beverly, Massachusetts, prepared from 
rabbit), and detected using alkaline phosphatase-linked anti-rabbit 
secondary antibody as described by the manufacturer (Promega Corp.; 
Madison, Wisconsin). Western blot analysis showed that, except for the 
MYP2 clone, all of the other 8 isolates yielded a major product of 71 kDa, 
indicating that wild-type MYP fusion proteins are capable of efficient 
splicing in vivo. 
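
For the ligation that created MYP1, the insert to vector molar ratio can be estimated from the DNA masses and fragment lengths given above. The sketch below is illustrative only. It assumes an average mass of 650 g/mol per base pair of double-stranded DNA, a common approximation that is not given in this Example, and the function name is hypothetical. The ratio itself does not depend on that constant.

```python
BP_G_PER_MOL = 650.0  # ASSUMPTION: average mass of one base pair of dsDNA

def pmol_dsdna(micrograms, base_pairs):
    """Convert a mass of double-stranded DNA to picomoles of molecules."""
    return micrograms * 1e6 / (base_pairs * BP_G_PER_MOL)

insert_pmol = pmol_dsdna(0.4, 1300)     # 1.3 Kb XhoI-digested IVPS fragment
vector_pmol = pmol_dsdna(0.025, 7200)   # 7.2 Kb XhoI-StuI MIP21 fragment
molar_ratio = insert_pmol / vector_pmol

print(f"insert: {insert_pmol:.3f} pmol")
print(f"vector: {vector_pmol:.4f} pmol")
print(f"insert to vector ratio: {molar_ratio:.0f} to 1")
```

The insert is therefore present in a large molar excess over the vector, on the order of ninety to one.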

MODIFICATION OF WILD-TYPE YEAST INTEIN 

The first modification of the yeast intein was to create two unique 
restriction sites (BamHI and EcoRI) on either side of the C-terminal 
splicing junction. This would facilitate further cassette mutagenesis. 

1 µg of pMYP1 and 1 µg LITMUS 29 (New England Biolabs, Inc.; 
Beverly, Massachusetts) were digested separately in a 15 µl reaction 
mixture containing 1x buffer 2 (New England Biolabs, Inc.; Beverly, 
Massachusetts), 0.5 unit XhoI (New England Biolabs, Inc.; Beverly, 
Massachusetts), and 0.5 unit PstI (New England Biolabs, Inc.; Beverly, 
Massachusetts) at 37°C for 2 hr. After electrophoretic separation on a 
1% low melting agarose gel (FMC Corp.; Rockland, Maine), the XhoI-PstI 
fragment containing the yeast intein and the digested LITMUS 29 were 
excised from the gel. The gel slices were mixed and melted at 65°C for 10 
min. The mixture was then incubated at 42°C for 10 min before 1 unit of 
β-agarase (New England Biolabs, Inc.; Beverly, Massachusetts) was 
added. After a further 1 hr incubation, the mixture was ready for the DNA 
ligation reaction. The ligation was conducted in 1x T4 DNA ligase buffer 
(New England Biolabs, Inc.; Beverly, Massachusetts) containing 0.5 unit 
of T4 DNA ligase (New England Biolabs, Inc.; Beverly, Massachusetts) at 
15°C overnight. 15 µl of the ligation mixture was used to transform E. 
coli strain ER2267 (New England Biolabs, Inc.; Beverly, Massachusetts). 
The resulting construct was named pLit-YP, a LITMUS vector containing 
the yeast intein. 

pLit-YP was used for the synthesis of the single-stranded DNA 
and the subsequent Kunkel mutagenesis (Kunkel, T.A., PNAS (1985), 
82:488, the disclosure of which is hereby incorporated by reference 
herein). pLit-YP was first transformed into the competent E. coli strain 
CJ236 (New England Biolabs, Inc.; Beverly, Massachusetts). A single 
colony was picked to inoculate 50 ml rich LB medium. The cells were 
allowed to grow at 37°C for 2-3 hr under vigorous aeration. 50 µL of 
M13K07 helper phage (New England Biolabs, Inc.; Beverly, 
Massachusetts) was then added to the culture. After another one hour 
of culture, kanamycin was added to a final concentration of 70 µg per mL 
culture. After overnight culture, the cells were spun down. 10 mL of 20% 
PEG containing 2.5 M NaCl was added to the supernatant. The phage 
which contained the single-stranded Lit-YP DNA (ss pLit-YP) was 
allowed to precipitate on ice for 1 hr. The supernatant was then 
centrifuged at 8000 rpm for 10 min. The phage pellet was resuspended 
in 1.6 mL TE buffer (20 mM Tris, pH 8.0, 1 mM EDTA). 400 µl of 20% 
PEG containing 2.5 M NaCl was then added to re-precipitate the phage 
for 5 min at 25°C. The phage pellet was spun down again and 
resuspended in 600 µl TE buffer. After three phenol extractions and 
one chloroform extraction, the single-stranded DNA was 
precipitated in 60% ethanol containing 0.2 M NaOAc. The DNA pellet 
was then dried and resuspended in 30 µl TE buffer. 

Two mutagenic primers, MYP(EcoR) (5'-GAATGCGGAATTCAGGCCTCCGCA-3'
(SEQ ID NO:72)) and MYP(Bam)
(5'-ATGGACGACAACCTGGGATCCAAGCAAAAACTGATGATC-3' (SEQ ID NO:73)), were
first 5' phosphorylated. The mutagenic primers (20 pmol each) were added
to a 20 µL reaction mixture containing 1x T4 polynucleotide kinase
buffer (New England Biolabs, Inc.; Beverly, Massachusetts), 1 mM ATP,
and 1 unit of T4 polynucleotide kinase (New England Biolabs, Inc.;
Beverly, Massachusetts). The reaction was conducted at 37°C for 30
min followed by a 10-min heat inactivation of the T4 polynucleotide
kinase at 65°C. 10 pmol of the phosphorylated mutagenic primers were
added to a 10 µL reaction mixture containing 0.1 µg of the
single-stranded pLit-YP template and 1x annealing buffer. The reaction mixture
was heated to 94°C for 4 min and slowly cooled to 25°C to allow the
primers to anneal to the template. The subsequent elongation reaction was
conducted at 37°C for 2 hr in a 50 µL mixture containing 1x T7
polymerase buffer (New England Biolabs, Inc.; Beverly, Massachusetts),
0.5 µg BSA, 300 mM dNTPs, 1 mM ATP, the annealed template, 1 unit of
T7 DNA polymerase (New England Biolabs, Inc.; Beverly,
Massachusetts) and 1 unit of T4 DNA ligase (New England Biolabs, Inc.;
Beverly, Massachusetts). 15 µL of the elongation mixture was used to
transform the E. coli strain ER2267. The resulting plasmid, pLit-YP',
contained two unique restriction sites, BamHI and EcoRI, on either
side of the yeast intein C-terminal splicing junction. Gly447 and
Ser448 of the intein were mutated into Ala and Asn, respectively.
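
As a quick sanity check on the primer design, one can confirm that each mutagenic primer carries the recognition site it is named for. The sketch below is an illustration, not part of the original method; the helper `find_sites` is hypothetical, and it relies only on the well-known recognition sequences GAATTC (EcoRI) and GGATCC (BamHI).

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

# Primer sequences as given for SEQ ID NO:72 and SEQ ID NO:73.
PRIMERS = {
    "MYP(EcoR)": "GAATGCGGAATTCAGGCCTCCGCA",
    "MYP(Bam)": "ATGGACGACAACCTGGGATCCAAGCAAAAACTGATGATC",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for each site in seq."""
    hits = {}
    for enzyme, site in sites.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        hits[enzyme] = positions
    return hits


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Each primer contains exactly one copy of its own site and none of the other, which is consistent with the statement that pLit-YP' gains one BamHI and one EcoRI site flanking the splicing junction. Whether those sites are unique in the full plasmid cannot be checked from the primers alone.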

1 µg of pMYP and 1 µg of pLit-YP' were digested separately in a 15 µL
reaction mixture containing 1x buffer 2 (New England Biolabs, Inc.;
Beverly, Massachusetts), 0.5 unit XhoI (New England Biolabs, Inc.;
Beverly, Massachusetts), and 0.5 unit PstI (New England Biolabs, Inc.;
Beverly, Massachusetts) at 37°C for 2 hr. After electrophoretic separation
on a 1% low melting agarose gel (FMC Corp.; Rockland, Maine), the
XhoI-PstI fragment from pLit-YP' and the digested pMYP were excised
from the gel. The gel slices were mixed and melted at 65°C for 10 min.
The mixture was then incubated at 42°C for 10 min before 1 unit of
β-agarase (New England Biolabs, Inc.; Beverly, Massachusetts) was
added. After a further 1 hr incubation, the mixture was ready for the DNA
ligation reaction. The ligation was conducted in 1x T4 DNA ligase buffer
(New England Biolabs, Inc.; Beverly, Massachusetts) containing 0.5 unit
of T4 DNA ligase (New England Biolabs, Inc.; Beverly, Massachusetts) at
15°C overnight. 15 µL of the ligation mixture was used to transform E.
coli strain ER2267 (NEB#746; New England Biolabs, Inc.; Beverly,
Massachusetts). The resulting construct was named pMYP'.

The second modification was to replace Asn454 with Ala. This
was achieved by cassette mutagenesis.

1 µg of pMYP' was digested at 37°C for 2 hours in 15 µL of 1x
Buffer 1 (New England Biolabs, Inc.; Beverly, Massachusetts), 100 µg/mL
BSA, 1 unit of XhoI and 1 unit of KpnI (New England Biolabs, Inc.;
Beverly, Massachusetts). After electrophoretic separation on a 1% low
melting agarose gel (FMC Corp.; Rockland, Maine), the digested pMYP'
plasmid DNA was excised from the gel. The gel slices were melted at 65°C
for 10 min and then incubated at 42°C for 10 min before 1 unit of
β-agarase (New England Biolabs, Inc.; Beverly, Massachusetts) was
added. After a further 1 hr incubation, the purified pMYP' digest was ready
for the DNA ligation reaction. Two complementary oligomers,
MYP'(N454A)FW (5'-GATCCCAGGTTGTCGTCCATGCATGCGGAGGCCTG-3'
(SEQ ID NO:74)) and MYP'(N454A)RV
(5'-AATTCAGGCCTCCGCATGCATGGACGACAACCTGG-3' (SEQ ID NO:75)), were
allowed to anneal to form a double-stranded linker, MYP'(N454A)FW/RV.
100 pmol of each of the oligomers MYP'(N454A)FW and MYP'(N454A)RV
were incubated in 20 µL of 1x annealing buffer at 90°C for 4 min and
slowly cooled to 37°C. Approximately 0.1 µg of the XhoI-KpnI digested
pMYP' DNA was ligated with 20 pmol of the annealed linker
MYP'(N454A)FW/RV at 16°C overnight in a 20 µL reaction mixture
containing 1x T4 DNA ligase buffer (New England Biolabs, Inc.; Beverly,
Massachusetts) and 80 units of T4 DNA ligase (New England Biolabs,
Inc.; Beverly, Massachusetts). The ligated DNA sample was used to
transform E. coli strain ER2267. The resulting plasmid was named
pMYP'(N454A).
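
The annealing of the two oligomers into a linker with cohesive ends can be checked computationally. The following is a minimal sketch, not the authors' procedure: `reverse_complement` and `sticky_ends` are hypothetical helper names, and the search assumes both strands carry 5' overhangs of at most eight nucleotides.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Oligomer sequences as given for SEQ ID NO:74 and SEQ ID NO:75.
FW = "GATCCCAGGTTGTCGTCCATGCATGCGGAGGCCTG"
RV = "AATTCAGGCCTCCGCATGCATGGACGACAACCTGG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def sticky_ends(fw, rv, max_overhang=8):
    """Return the 5' overhangs (fw_overhang, rv_overhang) of the duplex.

    Tries successively longer 5' overhangs on fw until the rest of fw
    pairs perfectly with rv; returns None if no such register exists.
    """
    rc = reverse_complement(rv)
    for n in range(max_overhang + 1):
        core = fw[n:]  # part of fw that must be base-paired
        if core and rc.startswith(core):
            return fw[:n], rv[: len(rv) - len(core)]
    return None


print(sticky_ends(FW, RV))
```

For these two oligomers the duplex has a 31 bp paired core with a GATC overhang on one end and an AATT overhang on the other, the same four-base 5' overhangs that BamHI and EcoRI digestion produce.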



CONSTRUCTION OF THE YEAST INTEIN PURIFICATION
VECTOR pMYB129

The yeast intein purification vector employed the chitin-binding
domain as the affinity tag for affinity purification. Since pMIC (Example
11) contains the chitin-binding domain and compatible restriction sites
for direct cloning, the XhoI-BamHI fragment from pMYP'(N454A) was first
transferred into pMIC, replacing the original XhoI-BamHI sequence. In
the next step, a BamHI-AgeI linker insertion was conducted to restore the
yeast intein C-terminal splicing junction sequence.



1 µg of pMYP'(N454A) and 1 µg of pMIC were digested separately
in a 15 µL reaction mixture containing 1x BamHI buffer (New England
Biolabs, Inc.; Beverly, Massachusetts), 0.5 unit XhoI (New England
Biolabs, Inc.; Beverly, Massachusetts), and 0.5 unit BamHI (New
England Biolabs, Inc.; Beverly, Massachusetts) at 37°C for 2 hr. After
electrophoretic separation on a 1% low melting agarose gel (FMC Corp.;
Rockland, Maine), the XhoI-BamHI fragment from pMYP'(N454A) and the
digested pMIC were excised from the gel. The gel slices were mixed
and melted at 65°C for 10 min. The mixture was then incubated at 42°C
for 10 min before 1 unit of β-agarase (New England Biolabs, Inc.;
Beverly, Massachusetts) was added. After a further 1 hr incubation, the
mixture was ready for the DNA ligation reaction. The ligation was conducted
in 1x ligase buffer (New England Biolabs, Inc.; Beverly, Massachusetts)
containing 0.5 unit of T4 DNA ligase (New England Biolabs, Inc.;
Beverly, Massachusetts) at 15°C overnight. 15 µL of the ligation mixture
was used to transform E. coli strain ER2267 (New England Biolabs, Inc.;
Beverly, Massachusetts). The resulting construct was pMY-IC.



1 µg of pMY-IC was digested at 37°C for 2 hours in 15 µL of 1x
BamHI buffer (New England Biolabs, Inc.; Beverly, Massachusetts)
containing 1 unit of BamHI (New England Biolabs, Inc.; Beverly,
Massachusetts) and 1 unit of AgeI (New England Biolabs, Inc.; Beverly,
Massachusetts). After electrophoretic separation on a 1% low melting
agarose gel (FMC Corp.; Rockland, Maine), the digested pMY-IC plasmid
DNA was excised from the gel. The gel slices were melted at 65°C for 10
min and then incubated at 42°C for 10 min before 1 unit of β-agarase
(New England Biolabs, Inc.; Beverly, Massachusetts) was added. After a
further 1 hr incubation, the purified pMY-IC digest was ready for the DNA
ligation reaction. Two complementary oligomers, MYB(Bam-Age)FW
(5'-GATCCCAGGTTGTCGTCCATGCATGCGGTGGCCTGA-3' (SEQ ID NO:76)) and
MYB(Bam-Age)RV (5'-CCGGTCAGGCCTCCGCATGCATGGACGACAACCTGG-3'
(SEQ ID NO:77)), were allowed to anneal to form a double-stranded
linker, MYB(Bam-Age)FW/RV. 100 pmol of each of the oligomers
MYB(Bam-Age)FW and MYB(Bam-Age)RV were incubated in 20 µL of
1x annealing buffer at 90°C for 4 min and slowly cooled to 37°C.
Approximately 0.1 µg of the BamHI-AgeI digested pMY-IC DNA was
ligated with 20 pmol of annealed linker at 16°C overnight in a 20 µL
reaction mixture containing 1x T4 DNA ligase buffer (New England
Biolabs, Inc.; Beverly, Massachusetts) and 80 units of T4 DNA ligase (New
England Biolabs, Inc.; Beverly, Massachusetts). The ligated DNA
sample was used to transform E. coli strain ER2267. The resulting
plasmid was named pMYB129 (Figure 21), a sample of which has been
deposited under the terms and conditions of the Budapest Treaty with
the American Type Culture Collection on December 28, 1995 and
received ATCC Accession Number 97398.

ONE STEP PURIFICATION OF THE TARGET PROTEIN BY
THE CHEMICAL INDUCIBLE CLEAVAGE ACTIVITY OF THE
IVPS FROM SACCHAROMYCES CEREVISIAE

The pMYB129 construct was used to illustrate the one step
purification of a target protein. Here the maltose binding protein is the
target protein. The E. coli strain ER2267 harboring pMYB129 was
cultured at 37°C in 1 liter of LB medium supplemented with 100 µg/mL
ampicillin. The culture was allowed to grow until the OD at 600 nm
reached 0.7. Induction was conducted by adding IPTG to a final
concentration of 0.4 mM. The induced culture was grown at 30°C for 3
hr before the cells were harvested by centrifugation at 4000 rpm for 25
min. The cell pellet was resuspended in 50 mL of the column buffer (20
mM HEPES, pH 7.6, 0.5 M NaCl). The cell suspension was sonicated
for 6 min and then centrifuged at 13,000 rpm for 30 min to give the clear
lysate (around 50 mL).

The lysate was directly loaded onto chitin (Sigma; St. Louis,
Missouri) and binding was allowed at 4°C for 30 min. (Other preferred
chitin resins which can be employed are described hereinbelow.) The
chitin was then washed with 10 volumes of column buffer (20 mM
HEPES, pH 7.6, 0.5 M NaCl). The column buffer containing 30 mM
dithiothreitol (DTT) was used to elute the MBP protein (Figure 22A and
22B). The elution was conducted at 4°C for 16 hr. Only the maltose
binding protein was eluted from the chitin under these conditions (Figure
22A and 22B).
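
The final concentrations quoted above (0.4 mM IPTG for induction, 30 mM DTT for elution) translate into stock volumes by the usual C1V1 = C2V2 relation. The sketch below is illustrative only; the 1 M stock concentrations and the 50 mL elution buffer volume are assumptions that do not appear in the text, and `stock_volume_ml` is a hypothetical helper.

```python
def stock_volume_ml(final_mM, final_volume_ml, stock_mM):
    """Volume of stock (mL) needed to reach final_mM in final_volume_ml.

    Uses C1 * V1 = C2 * V2 and ignores the small volume change caused
    by adding the stock itself.
    """
    if stock_mM <= final_mM:
        raise ValueError("stock must be more concentrated than the target")
    return final_mM * final_volume_ml / stock_mM


# Assumed 1 M (1000 mM) stocks; neither stock concentration is given in the text.
iptg_ml = stock_volume_ml(0.4, 1000.0, 1000.0)  # 1 liter culture
dtt_ml = stock_volume_ml(30.0, 50.0, 1000.0)    # assumed 50 mL of elution buffer
print(f"IPTG stock: {iptg_ml:.2f} mL, DTT stock: {dtt_ml:.2f} mL")
```

Under these assumptions the induction needs about 0.4 mL of IPTG stock and the elution buffer about 1.5 mL of DTT stock; other stock concentrations scale the volumes inversely.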



MYT fusion protein was purified on an amylose resin (NEB Protein
fusion and purification system) as described in Example 9. In vitro
cleavage experiments have shown that 30 mM β-mercaptoethanol
(β-ME) and 30 mM DTT result in approximately 70% and 90% cleavage of
MYB, respectively (Figure 23A and 23B).

PREPARATION OF CHITIN BOUND TO SEPHAROSE 4B 

One liter settled bed volume of Sepharose 4B (Pharmacia;
Piscataway, New Jersey) (prewashed with 5 volumes of water) is
suspended in 1 liter of 0.3 M NaOH, 1 liter of 1,4-butanediol diglycidoxy
ether and 2 grams of sodium borohydride. The suspension is gently
rocked in a closed container at room temperature for 4 hours. The epoxy
activated Sepharose 4B beads are washed in a Buchner funnel (placed
on a side arm flask equipped with vacuum or aspirator) with 3 liters of
0.3 M NaOH aqueous solution followed by 6 volumes of deionized
water until the effluent pH is neutral. After washing, the epoxy activated
Sepharose 4B beads are suspended in 1 liter of aqueous solution
containing 40 grams of sodium meta-periodate. The suspension is
shaken in a closed container at room temperature for 90 minutes. The
resulting spacer linked aldehyde Sepharose 4B beads are washed with
3 liters of water in a Buchner funnel (vacuum assisted or aspirator). The
bead paste is added to 1.2 liters of 4% (v/v) aqueous acetic acid solution
containing 45 grams of chitosan (Pfanstiehl Laboratories; Waukegan,
Illinois) and 4 grams of sodium cyanoborohydride. (The chitosan solution
is prepared by autoclaving the carbohydrate polymer in the 4% (v/v)
aqueous acetic acid for one hour.) The suspension of
aldehyde Sepharose 4B beads in the chitosan solution is gently rocked
in a closed container for 18 hours at room temperature. The resulting
chitosan coupled Sepharose 4B is washed in a Buchner funnel (vacuum
assisted or aspirator) with 10 liters of water. The beads are then washed
with 1 liter of methanol. The methanol bead paste is suspended in 750
ml of acetic anhydride and gently rocked in a sealed polyethylene
container for 18 hours at room temperature. The resulting chitin bound
bead suspension is transferred to a Buchner funnel. After removal of
acetic anhydride by filtration (vacuum assisted or aspirator), the beads
are washed with 3 liters of methanol followed by 6 liters of deionized
water. Completion of acetylation is tested using a
glucosamine standard and the TNBS:perchloric acid assay (Wilkie, S.,
Landry, D., BioChromatography, 3(5):205-214 (1988), the disclosure of
which is hereby incorporated by reference herein). If amine is detected,
the beads are reacetylated as already described. Finally the beads are
washed in a Buchner funnel with 1 liter of 0.3 M NaOH. The chitin beads
are suspended in 1 liter of 0.3 M NaOH containing 0.5 grams of sodium
borohydride and gently rocked in a sealed container for 18 hours at
room temperature. The beads are washed in a Buchner funnel with 6
liters of deionized water until the pH of the effluent is neutral. The chitin
bound Sepharose beads are stored suspended in 30% methanol/H2O
(v/v).

PREPARATION OF CHITIN BEADS

The beaded form of chitin is prepared by the solidification
(precipitation) of chitosan in aqueous solution while the aqueous
solution is shaped into beaded droplets. Beaded droplets of the
aqueous solution are created by stirring the aqueous solution with an
organic water insoluble layer (pentanol), forming an emulsion which is
stabilized by adding a surfactant or stabiliser (Tween 80). The beads of
chitosan are formed as the pH is increased and are crosslinked in the
reaction with the addition of 1,4-butanediol diglycidyl ether. The bead
quality, such as size and shape, is directly affected by the concentration and
length of chitosan, the volumes and densities of the water and oil layers, the shapes
and relative dimensions of the stirrer and reaction vessel, the amount and
chemical type of stabiliser, and the temperature.

The apparatus with dimensions shown in Figure 24 is set up. To
the reaction vessel is added 1 liter pentanol, 50 ml
polyoxyethylenesorbitan monooleate (Tween 80; Sigma Chemical Co., St.
Louis, Missouri), and 50 ml 1,4-butanediol diglycidyl ether. The stirring
solution is equilibrated to 70°C. The stirring shaft is maintained at 300
rpm. A filtered solution of 7.5 chitosan (MW = 70,000; Fluka Chemical
Co., Ronkonkoma, New York) in 1 liter of 5% acetic acid in water (v/v;
preheated to 70°C) is added to the stirring solution of pentanol,
detergent and crosslinker. The emulsion is maintained at 70°C and 100
ml of 10 M NaOH is added dropwise over a period of 12 minutes. The
emulsion is allowed to stir at 300 rpm at 70°C for one hour. The stirring
and heating are stopped after one hour and the pentanol layer (top) is
allowed to separate from the aqueous bead suspension. The top alcohol
layer is siphoned off from the bottom aqueous layer by an aspirator.
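
A rough stoichiometric check shows why the dropwise NaOH addition raises the pH enough for the chitosan to solidify. This is a back-of-the-envelope sketch rather than anything stated in the original: it assumes that "5% (v/v)" means 50 mL of glacial acetic acid per liter, uses the handbook density (1.049 g/mL) and molar mass (60.05 g/mol) of acetic acid, and ignores the protonated amine groups of the chitosan itself. The function names are hypothetical.

```python
ACETIC_ACID_DENSITY_G_PER_ML = 1.049  # glacial acetic acid, handbook value
ACETIC_ACID_MOLAR_MASS = 60.05        # g/mol


def acetic_acid_moles(solution_liters, percent_vv):
    """Moles of acetic acid in a v/v solution made from glacial acid."""
    acid_ml = solution_liters * 1000.0 * percent_vv / 100.0
    return acid_ml * ACETIC_ACID_DENSITY_G_PER_ML / ACETIC_ACID_MOLAR_MASS


def naoh_moles(volume_ml, molarity):
    """Moles of NaOH delivered by volume_ml of a solution of given molarity."""
    return volume_ml / 1000.0 * molarity


acid = acetic_acid_moles(1.0, 5.0)  # 1 liter of 5% (v/v) acetic acid
base = naoh_moles(100.0, 10.0)      # 100 ml of 10 M NaOH
print(f"acid: {acid:.3f} mol, base: {base:.3f} mol, excess base: {base - acid:.3f} mol")
```

Under these assumptions the NaOH slightly exceeds the acetic acid on a molar basis, so the aqueous phase ends up alkaline, which is consistent with the description of beads forming as the pH is increased.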

The aqueous chitosan bead suspension is transferred to a
Buchner funnel equipped with an aspirator pump and washed with 5
liters of water followed by 1.5 liters of methanol. The methanol bead
paste is transferred to a polyethylene container and suspended in 200
ml of acetic anhydride. The beads are acetylated in the sealed container
at room temperature with gentle rocking for 18 hours. The resulting chitin
beads are transferred to a Buchner funnel and washed with 2 liters of
methanol followed by 4 liters of water. Finally the beads are washed with
1 liter of 0.3 M NaOH.

The alkaline bead paste is transferred to a polyethylene
container and suspended in 1 liter of 0.3 M NaOH containing 0.5 grams
of sodium borohydride. The chitin bead suspension is gently rocked in
the sealed polyethylene container for 18 hours at room temperature. The
chitin bead suspension is transferred to a Buchner funnel and washed
with 4 liters of deionized water or until the pH of the effluent is neutral. The
beads are stored in 500 ml of 30% methanol/water (v/v).

This invention has been described in detail including the preferred
embodiments thereof. However, it will be appreciated that those skilled 
in the art, upon consideration of this disclosure, may make modifications 
and improvements thereon without departing from the spirit and scope of 
the invention as set forth in the claims.

(A) APPLICATION NUMBER: US 08/146,885
(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/004,139

(B) FILING DATE: 09-DEC-1992
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: WILLIAMS, GREGORY D.

(B) REGISTRATION NUMBER: 30901

(C) REFERENCE/DOCKET NUMBER: NEB-036C2

(ix) TELECOMMUNICATION INFORMATION:
(C) TELEX:

(2) INFORMATION FOR SEQ ID NO:1:
(A) LENGTH: 5837 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: not relevant

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GAATTCGCGA TAAAATCTAT TTTCTTCCTC CATTTTTCAA TTTCAAAAAC GTAAGCATGA 60 
GCCAAACCTC TCGCCCTTTC TCTGTCCTTC CCGCTAACCC TCTTGAAAAC TCTCTCCAAA 120 
GCATTTTTTG ATGAAAGCTC ACGCTCCTCT ATGAGGGTCA GTATATCTGC AATGAGTTCG 180 
TGAAGGGTTA TTCTGTAGAA CAACTCCATG ATTTTCGATT TGGATGGGGG TTTAAAAATT 240 
TGGCGGAACT TTTATTTAAT TTGAACTCCA GTTTATATCT GGTGGTATTT ATGATACTGG 300 
ACACTGATTA CATAACAAAA GATGGCAAGC CTATAATCCG AATTTTTAAG AAAGAGAACG 360 
GGGACTTTAA AATAGAACTT GACCCTCATT TTCAGCCCTA TATATATGCT CTTCTCAAAG 420 
ATGACTCCGC TATTGAGGAG ATAAAGGCAA TAAAGGGCGA GAGACATGGA AAAACTGTGA 480 
GAGTGCTCGA TGCAGTGAAA GTCAGGAAAA AATITTTGGG AAGGGAAGTT GAAGTCTGGA 540 
AGCTCATTTT CGAGCATCCC CAAGACGTTC CAGCTATGCG GGGCAAAATA AGGGAACATC 600
CAGCTGTGGT TGACATTTAC GAATATGACA TACCCTTTGC CAAGCGTTAT CTCATAGACA 660 
AGGGCTTGAT TCCCATGGAG GGAGACGAGG AGCTTAAGCT CCTTGCCTTT GATATTGAAA 720 
CGTTTTATCA TGAGGGAGAT GAATTTGGAA AGGGCGAGAT AATAATGATT AGTTATGCCG 780 
ATGAAGAAGA GGCCAGAGTA ATCACATGGA AAAATATCGA TTTCCCGTAT GTCGATGTTG 840 
TGTOCAATGA AAGAGAAATG ATAAAGCGTT TTGTTCAAGT TGTTAAAGAA AAAGACCCCG 900
ATGTGATAAT AACTTACAAT GGGGACAATT TTGATTTGCC GTATCTCATA AAACGGGCAG 960
AAAAGCTGGG AGTTCGGCTT GTCTTAGGAA GGGACAAAGA ACATCCCGAA CCCAAGATTC 1020 
AGAQGATGGG TGATAGTTTT GCTGTGGAAA TCAAGGGTAG AATOCACTTT GATCTTTTCC 1080 
CAGTTGTGCG AAGGACGATA AACCTCCCAA CGTATACGCT TGAGGCAGTT TATGAAGCAG 1140 
TTTTAGGAAA AACCAAAAGC AAATTAGGAG CAGAGGAAAT TGCCGCTATA TGGGAAACAG 1200 
AAGAAAGCAT GAAAAAACTA GCCCAGTACT CAATGGAAGA TGCTAGGGCA ACGTATGAGC 1260 
TCGGGAAGGA ATTCTTCCCC ATGGAAGCTG AGCTGGCAAA GCTGATAGGT CAAAGTGTAT 1320 
GGGACGTCTC GAGATCAAGC ACCGGCAACC TCGTGGAGTG GTATCTTTTA AGGGTGGCAT 1380 
ACGCGAGGAA TGAACTTGCA CCGAACAAAC CTGATGAGGA AGAGTATAAA CGGCGCTTAA 1440 
GAACAACTTA CCTGGGAGGA TATGTAAAAG AGCCAGAAAA AGGTTTGTGG GAAAATATCA 1500 
TTTATTTGGA TTTCCGCAGT CTGTACCCTT CAATAATAGT TACTCACAAC GTATCCCCAG 1560 
ATACCCTTGA AAAAGAGGGC TGTAAGAATT ACGATGTTGC TCCGATAGTA GGATATAGGT 1620 
TCTGCAAGGA CTTTCCGGGC TTTATTCCCT CCATACTCGG GGACTTAATT GCAATGAGGC 1680 
AAGATATAAA GAAGAAAATG AAATCCACAA TTGACCCGAT CGAAAAGAAA ATGCTCGATT 1740 
ATAGGCAAAG GGCTATTAAA TTGCTTGCAA ACAGCATCTT ACCCAACGAG TGGTTACCAA 1800 
TAATTGAAAA TGGAGAAATA AAATTCGTGA AAATTGGCGA GTTTATAAAC TCTTACATGG 1860 
AAAAACAGAA GGAAAACGTT AAAACAGTAG AGAATACTGA AGTTCTCGAA GTAAACAAOC 1920 
TTTTTGCATT CTCATTCAAC AAAAAAATCA AAGAAAGTGA AGTCAAAAAA GTCAAAGCCC 1980 
TCATAAGACA TAAGTATAAA GGGAAAGCTT ATGAGATTGA GCTTAGCTCT GGTAGAAAAA 2040 
TTAACATAAC TGCTGGCCAT AGTCTGTTTA CAGTTAGAAA TGGAGAAATA AAGGAAGTTT 2100 
CTGGAGATGG GATAAAAGAA GGTGACCTTA TTGTAGCACC AAAGAAAATT AAACTCAATG 2160 
AAAAAGGGGT AAGCATAAAC ATTCCCGAGT TAATCTCAGA TCTTTCCGAG GAAGAAACAG 2220 
CCGACATTGT GATGACGATT TCAGCCAAGG GCAGAAAGAA CTTCTTTAAA GGAATGCTGA 2280 
GAACTTTAAG GTGGATGTTT GGAGAAGAAA ATAGAAGGAT AAGAACATTT AATCGCTATT 2340 
TGTTCCATCT CGAAAAACTA GGCCTTATCA AACTACTGOC CCGCGGATAT GAAGTTACTG 2400 
ACTGGGAGAG ATTAAAGAAA TATAAACAAC TTTACGAGAA GCTTGCTGGA AGCGTTAAGT 2460 
ACAACGGAAA CAAGAGAGAG TATTTAGTAA TGTTCAACGA GATCAAGGAT TTTATATCTT 2520 
ACTTCCCACA AAAAGAGCTC GAAGAATGGA AAATTGGAAC TCTCAATGGC TTTAGAACGA 2580 
ATTGTATTCT CAAAGTCGAT GAGGATTTTG GGAAGCTCCT AGGTTACTAT GTTAGTGAGG 2640
GCTATGCAGG TGCACAAAAA AATAAAACTG GTGGTATCAG TTATTCGGTG AAGCTTTAGA 2700 
ATGAGGACCC TAATGTTCTT GAGAGCATGA AAAATGTTGC AGAAAAATTC TTTGGCAAGG 2760 
TTAGAGTTGA CAGAAATTGC GTAAGTATAT CAAAGAAGAT GGCATACTTA GTTATGAAAT 2820
GCCTCTGTGG AGCATTAGCC GAAAACAAGA GAATTCCTTC TGTTATACTC ACCICTCCCG 2880
AACCGGTACG GTGGTCATTT TTAGAGGCGT ATTTTACAGG CGATGGAGAT ATACATCCAT 2940 
CAAAAAGGTT TAGGCTCTCA ACAAAAAGCG AGCTCCTTGC AAATCAGCTT GTGTTCTTGC 3000 
TGAACTCT rr GGGAATATCC TCTGTAAAGA TAGGCTTTGA CAGTGGGGTC TATAGAGTGT 3060 
ATATAAATGA AGACCTGCAA TTTCCACAAA CGTCTAGGGA GAAAAACACA TACTACTCTA 3120 
ACTTAATTCC CAAAGAGATC CTTAGGGACG TGTTTGGAAA AGAGTTCCAA AAGAACATGA 3180 
COTCAAGAA ATTTAAAGAG CTTGTTGACT CTGGAAAACT TAACAGGGAG AAAGCCAAGC 3240 
TCnGGAGTT CTTCATTAAT GGAGATATTG TCCTTGACAG AGTCAAAAGT GTTAAAGAAA 3300 
AGGACTATGA AGGGTATGTC TATGACCTAA GCGTTGAGGA TAACGAGAAC TTTCTTGTTG 3360 
GTTTTGGTTT GCTCTATGCT CACAACAGCT ATTACGGCTA TATGGGGTAT CCTAAGGCAA 3420 
GATGGTACTC GAAGGAATGT GCTGAAAGCG TTACCGCATG GGGGAGACAC TACATAGAGA 3480 
TGACGATAAG AGAAATAGAG GAAAAGTTCG GCTTTAAGGT TCTTTATGCG GACAGTGTCT 3540 
CAGGAGAAAG TGAGATCATA ATAAGGCAAA ACGGAAAGAT TAGATTTGTG AAAATAAAGG 3600 
ATCTTTTCTC TAAGGTGGAC TACAGCATTG GCGAAAAAGA ATACTGCATT CTCGAAGGTG 3660 
TTGAAGCACT AACTCTGGAC GATGACGGAA AGCTTGTCTG GAAGCCCGTC CCCTACGTGA 3720 
TGAGGCACAG AGCGAATAAA AGAATGTTCC GCATCTGGCT GACCAACAGC TGGTATATAG 3780 
ATGTTACTGA GGATCATTCT CTCATAGGCT ATCTAAACAC GTCAAAAACG AAAACTGCCA 3840 
AAAAAATCGG GGAAAGACTA AAGGAAGTAA AGCCTTTTGA ATTAGGCAAA GCAGTAAAAT 3900 
CGCTCATATG CCCAAATGCA CCGTTAAAGG ATGAGAATAC CAAAACTAGC GAAATAGCAG 3960 
TAAAATTCTG GGAGCTCGTA GGATTGATTG TAGGAGATGG AAACTGGGGT GGAGATTCTC 4020 
GTTGGGCAGA GTATTATCTT GGACTTTCAA CAGGCAAAGA TGCAGAAGAG ATAAAGCAAA 4080 
AACTTCTGGA ACCCCTAAAA ACTTATGGAG TAATCTCAAA CTATTACCCA AAAAACGAGA 4140 
AAGGGGACTT CAACATCTTG GCAAAGAGCC TTGTAAAGTT TATGAAAAGG CACTTTAAGG 4200 
ACGAAAAAGG AAGACGAAAA ATTCCAGAGT TCATGTATGA GCTTCCGGTT ACTTACATAG 4260 
^rrXCX ACGAGGACTG TTTTCAGCTG ATGGTACTGT AACTATCAGG AAGGGAGTTC 4320 
C^TCAG GCTAACAAAC ATTGATGCTG ACTTTCTAAG GGAAGTAAGG AAGCTTCTGT 4380 
GGATTGTTGG AATTTCAAAT TCAATATTTG CTGAGACTAC TCCAAATCGC TACAATGGTG 4440 
TTTCTACTGG AACCTACTCA AAGCATCTAA GGATCAAAAA TAAGTGGCGT TTTGCTGAAA 4500 
GGATAGGCTT TTTAATCGAG AGAAAGCAGA AGAGACTTTT AGAACATTTA AAATCAGCGA 4560 
G^^O GAATACCATA GATTTTGGCT TTGATCTTGT GCATGTGAAA AAAGTCGAAG 4620 
AGATACCATA CGAGGGTTAC GTTTATGACA TTGAAGTCG 1 AGAGACGCAT AGCnCT^ 4680 
C^ACAT CCTGGTACAC AATACTGACG GCITTTATC CACAATACCC GGGGAAAAGC 4740
CTGAACTCAT TAAAAAGAAA GCCAAGGAAT TCCTAAACTA CATAAACTCC AAACTTCCAG 4800
GTCTGCTTGA GCTTGAGTAT GAGGGCTTTT ACTTGAGAGG ATTCTTTGTT ACAAAAAAGC 4860 
GCTATGCAGT CATAGATGAA GAGGGCAGGA TAACAACAAG GGGCTTGGAA GTAGTAAGGA 4920 
GAGATTGGAG TGAGATAGCT AAGGAGACTC AGGCAAAGGT TTTAGAGGCT ATACTTAAAG 4980 
AGGGAAGTGT TGAAAAAGCT GTAGAAGTTG TTAGAGATGT TGTAGAGAAA ATAGCAAAAT 5040 
ACAGGGTTCC ACTTGAAAAG CTTGTTATCC ATGAGCAGAT TACCAGGGAT TTAAAGGACT 5100 
ACAAAGCCAT TGGCCCTCAT GTCGCGATAG CAAAAAGACT TGCCGCAAGA GGGATAAAAG 5160 
TGAAACCGGG CACAATAATA AGCTATATCG TTCTCAAAGG GAGCGGAAAG ATAAGCGATA 5220 
GGGTAATTTT ACTTACAGAA TACGATCCTA GAAAACACAA GTACGATCCG GACTACTACA 5280 
TAGAAAACCA AGTTTTGCCG GCAGTACTTA GGATACTCGA AGCGTTTGGA TACAGAAAGG 5340 
AGGATTTAAG GTATCAAAGC TCAAAACAAA CCGGCTTAGA TGCATGGCTC AAGAGGTAGC 5400 
TCTGTTGCTT TTTAGTCCAA GTTTCTCCGC GAGTCTCTCT ATCTCTCTTT TGTATTCTGC 5460 
TATGTGGTTT TCATTCACTA TTAAGTAGTC CGCCAAAGCC ATAACGCTTC CAATTCCAAA 5520 
CTTGAGCTCT TTCCAGTCTC TGGCCTCAAA TTCACTCCAT GTTTTTGGAT CGTCGCTTCT 5580 
CCCTCTTCTG CTAAGCCTCT CGAATCTTTT TCTTGGCGAA GAGTGTACAG CTATGATGAT 5640 
TATCTCTTCC TCTGGAAACG CATCTTTAAA CGTCTGAATT TCATCTAGAG ACCTCACTCC 5700 
GTCGATTATA ACTGCCTTGT ACTTCTTTAG TAGTTCTTTT ACCTTTGGGA TCGTTAATTT 5760 
TGCCACGGCA TTGTCCCCAA GCTCCTGCCT AAGCTGAATG CTCACACTGT TCATACCTTC 5820 
GGGAGTTCTT GGGATCC 5837 
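
Because each line of the listing ends with a running base count, transcription or scanning damage can be located mechanically. The sketch below is a hypothetical helper, not part of the patent; it assumes the layout used above, with whitespace-separated blocks of bases followed by the cumulative position.

```python
def check_listing(lines):
    """Compare each line's base count with its printed running position.

    Returns a list of (line_number, printed_position, counted_position)
    for every line where the two disagree or the position is unreadable.
    """
    problems = []
    counted = 0
    for number, line in enumerate(lines, start=1):
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        *blocks, printed = tokens
        counted += sum(len(block) for block in blocks)
        if not printed.isdigit():
            problems.append((number, printed, counted))
        elif int(printed) != counted:
            problems.append((number, int(printed), counted))
            counted = int(printed)  # resynchronise on the printed value
    return problems


# First two lines of SEQ ID NO:1 as printed above.
listing = [
    "GAATTCGCGA TAAAATCTAT TTTCTTCCTC CATTTTTCAA TTTCAAAAAC GTAAGCATGA 60",
    "GCCAAACCTC TCGCCCTTTC TCTGTCCTTC CCGCTAACCC TCTTGAAAAC TCTCTCCAAA 120",
]
print(check_listing(listing))
```

An empty result means every line carries the number of bases its position implies. A line that has lost or gained a character is reported with both the printed and the counted position, which narrows the search to a single 60-base row; a count check cannot detect a substituted character.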

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4707 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: not relevant
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: N-terminal
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 363..4298

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

GGATCCCTCT CTTTTTGGTA ACCCCATACG TCATTCCCTC AACCAAAACT TCAGCATCGT 60

TGCAGTGGTC AGTGTGTCTG TGGGAGATGA AGAGGACGTC GATTTTTCTG GGGTCTATCT 120

TGTATCTCCA CATTCTAACT AACGCTCCAG GCCCAGGATC AACGTAGATG TTTTTGCTCG 180

CCTTAATGAA GAAGCCACCA GTGGCTCTTG CCTGCGTTAT CGTGACGAAC CTTCCACCAC 240

CGCCACCGAG AAAAGTTATC TCTATCATCT CACACCTCCC CCATAACATC ACCTGCTCAA 300

TTTTTAAGCG TTCTTAAAGG CTTAAATACG TGAATTTAGC GTAAATTATT GAGGGATTAA 360

CT ATG ATA CTT GAC GCT GAC TAC ATC ACC GAG GAT GGG AAG CCG ATT 407
Met Ile Leu Asp Ala Asp Tyr Ile Thr Glu Asp Gly Lys Pro Ile
1 5 10 15

ATA AGG ATT TTC AAG AAA GAA AAC GGC GAG TTT AAG GTT GAG TAC GAC 455
Ile Arg Ile Phe Lys Lys Glu Asn Gly Glu Phe Lys Val Glu Tyr Asp
20 25 30

AGA AAC TTT AGA CCT TAC ATT TAC GCT CTC CTC AAA GAT GAC TCG CAG 503
Arg Asn Phe Arg Pro Tyr Ile Tyr Ala Leu Leu Lys Asp Asp Ser Gln
35 40 45

ATT GAT GAG GTT AGG AAG ATA ACC GCC GAG AGG CAT GGG AAG ATA GTG 551
Ile Asp Glu Val Arg Lys Ile Thr Ala Glu Arg His Gly Lys Ile Val
50 55 60

AGA ATT ATA GAT GCC GAA AAG GTA AGG AAG AAG TTC CTG GGG AGG CCG 599
Arg Ile Ile Asp Ala Glu Lys Val Arg Lys Lys Phe Leu Gly Arg Pro
65 70 75

ATT GAG GTA TGG AGG CTG TAC TTT GAA CAC CCT CAG GAC GTT CCC GCA 647
Ile Glu Val Trp Arg Leu Tyr Phe Glu His Pro Gln Asp Val Pro Ala
80 85 90 95

... ... ... ... ATA AGA GAG CAT TCC GCA GTT ATT GAC ATC TTT GAG 695
... ... ... ... Ile Arg Glu His Ser Ala Val Ile Asp Ile Phe Glu
100 105 110

TAC GAC ATT CCG TTC GCG AAG AGG TAC CTA ATA GAC AAA GGC CTA ATT 743
Tyr Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile
115 120 125

CCA ATG GAA GGC GAT GAA GAG CTC AAG TTG CTC GCA TTT GAC ATA GAA 791
Pro Met Glu Gly Asp Glu Glu Leu Lys Leu Leu Ala Phe Asp Ile Glu
130 135 140

ACC CTC TAT CAC GAA GGG GAG GAG TTC GCG AAG GGG CCC ATT ATA ATG 839
Thr Leu Tyr His Glu Gly Glu Glu Phe Ala Lys Gly Pro Ile Ile Met
145 150 155

... ... TAT GCT GAT GAG GAA GAA GCC AAA GTC ATA ACG TGG AAA AAG 887
... ... Tyr Ala Asp Glu Glu Glu Ala Lys Val Ile Thr Trp Lys Lys
160 165 170 175

ATC GAT CTC CCG TAC GTC GAG GTA GTT TCC AGC GAG AGG GAG ATG ATA 935
Ile Asp Leu Pro Tyr Val Glu Val Val Ser Ser Glu Arg Glu Met Ile
180 185 190

[Two rows (nucleotides 936-1031, residues 192-223) are illegible in the source.]

... ... ... ... ... ATA CTA CCC CTG GGA AGG GAC GGT AGT GAG CCA 1079
... ... ... ... ... Ile Leu Pro Leu Gly Arg Asp Gly Ser Glu Pro
225 230 235

AAG ATG CAG AGG CTT GGG GAT ATG ACA GCG GTG GAG ATA AAG GGA AGG 1127
Lys Met Gln Arg Leu Gly Asp Met Thr Ala Val Glu Ile Lys Gly Arg
240 245 250 255

ATA CAC TTT GAC CTC TAC CAC GTG ATT AGG AGA ACG ATA AAC CTC CCA 1175
Ile His Phe Asp Leu Tyr His Val Ile Arg Arg Thr Ile Asn Leu Pro
260 265 270

ACA TAC ACC CTC GAG GCA GTT TAT GAG GGA ATC TTC GGA AAG CCA AAG 1223
Thr Tyr Thr Leu Glu Ala Val Tyr Glu Ala Ile Phe Gly Lys Pro Lys
275 280 285

GAG AAA GTT TAC GCT CAC GAG ATA GCT GAG GCC TGG GAG ACT GGA AAG 1271
Glu Lys Val Tyr Ala His Glu Ile Ala Glu Ala Trp Glu Thr Gly Lys
290 295 300

GGA CTG GAG AGA GTT GCA AAG TAT TCA ATG GAG GAT GCA AAG GTA ACG 1319
Gly Leu Glu Arg Val Ala Lys Tyr Ser Met Glu Asp Ala Lys Val Thr
305 310 315

TAC GAG CTC GGT AGG GAG TTC TTC CCA ATG GAG GCC CAG CTT TCA AGG 1367
Tyr Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg
320 325 330 335

TTA GTC GGC CAG CCC CTG TGG GAT GTT TCT AGG TCT TCA ACT GGC AAC 1415
Leu Val Gly Gln Pro Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn
340 345 350

TTG GTG GAG TGG TAC CTC CTC AGG AAG GCC TAC GAG AGG AAT GAA TTG 1463
Leu Val Glu Trp Tyr Leu Leu Arg Lys Ala Tyr Glu Arg Asn Glu Leu
355 360 365

GCT CCA AAC AAG CCG GAT GAG AGG GAG TAC GAG AGA AGG CTA AGG GAG 1511
Ala Pro Asn Lys Pro Asp Glu Arg Glu Tyr Glu Arg Arg Leu Arg Glu
370 375 380

AGC TAC GCT GGG GGA TAC GTT AAG GAG CCG GAG AAA GGG CTC TGG GAG 1559
Ser Tyr Ala Gly Gly Tyr Val Lys Glu Pro Glu Lys Gly Leu Trp Glu
385 390 395

GGG TTA GTT TCC CTA GAT TTC AGG AGC CTG TAC CCC TCG ATA ATA ATC 1607
Gly Leu Val Ser Leu Asp Phe Arg Ser Leu Tyr Pro Ser Ile Ile Ile
400 405 410 415

ACC CAT AAC GTC TCA CCG GAT ACG CTG AAC AGG GAA GGG TGT AGG GAA 1655
Thr His Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Arg Glu
420 425 430

TAC GAT GTC GCC CCA GAG GTT GGG CAC AAG TTC TGC AAG GAC TTC CCG 1703
Tyr Asp Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro
435 440 445

GGG TTT ATC CCC AGC CTG CTC AAG AGG TTA TTG GAT GAA AGG CAA GAA 1751
Gly Phe Ile Pro Ser Leu Leu Lys Arg Leu Leu Asp Glu Arg Gln Glu
450 455 460

ATA AAA AGG AAG ATG AAA GCT TCT AAA GAC CCA ATC GAG AAG AAG ATG 1799
Ile Lys Arg Lys Met Lys Ala Ser Lys Asp Pro Ile Glu Lys Lys Met
465 470 475

CTT GAT TAC AGG CAA CGG GCA ATC AAA ATC CTG GCA AAC AGC ATT TTA 1847
Leu Asp Tyr Arg Gln Arg Ala Ile Lys Ile Leu Ala Asn Ser Ile Leu
480 485 490 495

CCG GAA GAA TGG GTT CCA CTA ATT AAA AAC GGT AAA GTT AAG ATA TTC 1895
Pro Glu Glu Trp Val Pro Leu Ile Lys Asn Gly Lys Val Lys Ile Phe
500 505 510

CGC ATT GGG GAC TTC GTT GAT GGA CTT ATG AAG GCG AAC CAA GGA AAA 1943
Arg Ile Gly Asp Phe Val Asp Gly Leu Met Lys Ala Asn Gln Gly Lys
515 520 525

GTG AAG AAA ACG GGG GAT ACA GAA GTT TTA GAA GTT GCA GGA ATT CAT 1991
Val Lys Lys Thr Gly Asp Thr Glu Val Leu Glu Val Ala Gly Ile His
530 535 540

GCG TTT TCC TTT GAC AGG AAG TCC AAG AAG GCC CGT GTA ATG GCA GTG 2039
Ala Phe Ser Phe Asp Arg Lys Ser Lys Lys Ala Arg Val Met Ala Val
545 550 555

AAA GCC GTG ATA AGA CAC CGT TAT TCC GGA AAT GTT TAT AGA ATA GTC 2087
Lys Ala Val Ile Arg His Arg Tyr Ser Gly Asn Val Tyr Arg Ile Val
560 565 570 575

TTA AAC TCT GGT AGA AAA ATA ACA ATA ACA GAA GGG CAT AGC CTA TTT 2135
Leu Asn Ser Gly Arg Lys Ile Thr Ile Thr Glu Gly His Ser Leu Phe
580 585 590

GTC TAT AGG AAC GGG GAT CTC GTT GAG GCA ACT GGG GAG GAT GTC AAA 2183
Val Tyr Arg Asn Gly Asp Leu Val Glu Ala Thr Gly Glu Asp Val Lys
595 600 605

ATT GGG GAT CTT CTT GCA GTT CCA AGA TCA GTA AAC CTA CCA GAG AAA 2231
Ile Gly Asp Leu Leu Ala Val Pro Arg Ser Val Asn Leu Pro Glu Lys
610 615 620

... ... ... TTG AAT ATT GTT GAA CTT CTT CTG AAT CTC TCA CCG GAA 2279
... ... ... Leu Asn Ile Val Glu Leu Leu Leu Asn Leu Ser Pro Glu
625 630 635

... ... ... ... ATA ATA CTT ACG ATT CCA GTT AAA GGC AGA AAG AAC 2327
... ... ... ... Ile Ile Leu Thr Ile Pro Val Lys Gly Arg Lys Asn
640 645 650 655

[Six rows (nucleotides 2328-2615, residues 656-751) are illegible in the source.]

ATT GGA ACT AGA AAT GGA TTC AGA ATG GGT ACG TTC GTA GAT ATT GAT 2663
Ile Gly Thr Arg Asn Gly Phe Arg Met Gly Thr Phe Val Asp Ile Asp
755 760 765

GAA GAT TTT GCC AAG CTT CTT GGC TAC TAT GTG AGC GAG GGA AGT GCG 2711 
Glu Asp Phe Ala Lys Leu Leu Gly Tyr Tyr Val Ser Glu Gly Ser Ala 
770 775 780 

AGG AAG TGG AAG AAT CAA ACT GGA GGT TGG AGT TAC ACT GTG AGA TTG 2759 
Arg Lys Trp Lys Asn Gin Thr Gly Gly Trp Ser Tyr Thr Val Arg Leu 
785 790 795 

TAC AAC GAG AAC GAT GAA GTT CTT GAC GAC ATG GAA CAC TTA GCC AAG 2807 
Tyr Asn Glu Asn Asp Glu Val Leu Asp Asp Met Glu His Leu Ala Lys 
800 805 810 815 

AAG TTT TTT GGG AAA GTC AAA CGT GGA AAG AAC TAT GTT GAG ATA CCA 2855 
Lys Phe Phe Gly Lys Val Lys Arg Gly Lys Asn Tyr Val Glu He Pro 
820 825 830 

AAG AAA ATG GCT TAT ATC ATC TTT GAG AGC CTT TGT GGG ACT TTG GCA 2903 
Lys Lys Met Ala Tyr He He Phe Glu Ser Leu Cys Gly Thr Leu Ala 
835 840 845 

GAA AAC AAA AGG GTT CCT GAG GTA ATC TTT ACC TCA TCA AAG GGC GTT 2951 
Glu Asn Lys Arg Val Pro Glu Val Ile Phe Thr Ser Ser Lys Gly Val
850 855 860 

AGA TGG GCC TTC CTT GAG GGT TAT TTC ATC GGC GAT GGC GAT GTT CAC 2999 
Arg Trp Ala Phe Leu Glu Gly Tyr Phe Ile Gly Asp Gly Asp Val His
865 870 875 

CCA AGC AAG AGG GTT CGC CTA TCA ACG AAG AGC GAG CTT TTA GTA AAT 3047 
Pro Ser Lys Arg Val Arg Leu Ser Thr Lys Ser Glu Leu Leu Val Asn 
880 885 890 895 

GGC CTT GTT CTC CTA CTT AAC TCC CTT GGA GTA TCT GCC ATT AAG CTT 3095 
Gly Leu Val Leu Leu Leu Asn Ser Leu Gly Val Ser Ala Ile Lys Leu
900 905 910 

GGA TAC GAT AGC GGA GTC TAC AGG GTT TAT GTA AAC GAG GAA CTT AAG 3143 
Gly Tyr Asp Ser Gly Val Tyr Arg Val Tyr Val Asn Glu Glu Leu Lys 
915 920 925 

TTT ACG GAA TAC AGA AAG AAA AAG AAT GTA TAT CAC TCT CAC ATT GTT 3191 
Phe Thr Glu Tyr Arg Lys Lys Lys Asn Val Tyr His Ser His Ile Val
930 935 940 

CCA AAG GAT ATT CTC AAA GAA ACT TTT GGT AAG GTC TTC CAG AAA AAT 3239 
Pro Lys Asp Ile Leu Lys Glu Thr Phe Gly Lys Val Phe Gln Lys Asn
945 950 955 

ATA AGT TAC AAG AAA TTT AGA GAG CTT GTA GAA AAT GGA AAA CTT GAC 3287 
Ile Ser Tyr Lys Lys Phe Arg Glu Leu Val Glu Asn Gly Lys Leu Asp
960 965 970 975 

AGG GAG AAA GCC AAA CGC ATT GAG TGG TTA CTT AAC GGA GAT ATA GTC 3335 
Arg Glu Lys Ala Lys Arg Ile Glu Trp Leu Leu Asn Gly Asp Ile Val
980 985 990 

CTA GAT AGA GTC GTA GAG ATT AAG AGA GAG T?" TAT GAT GGT TAC GTT 3383 
Leu Asp Arg Val Val Glu Ile Lys Arg Glu T Tyr Asp Gly Tyr Val
995 1000 1005

TAC GAT CTA AGT GTC GAT GAA GAT GAG AAT TTC CTT GCT GGC TTT GGA 3431
Tyr Asp Leu Ser Val Asp Glu Asp Glu Asn Phe Leu Ala Gly Phe Gly
1010 1015 1020 

TTC CTC TAT GCA CAT AAT AGC TAT TAT GGG TAT TAT GGG TAC GCA AAA 3479 
Phe Leu Tyr Ala His Asn Ser Tyr Tyr Gly Tyr Tyr Gly Tyr Ala Lys 
1025 1030 1035 

GCC CGT TGG TAC TGT AAG GAG TGC GCA GAG AGC GTT ACG GCC TGG GGG 3527 
Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser Val Thr Ala Trp Gly 
1040 1045 1050 1055

AGG GAA TAT ATA GAG TTC GTA AGG AAG GAA CTG GAG GAA AAG TTC GGG 3575 
Arg Glu Tyr Ile Glu Phe Val Arg Lys Glu Leu Glu Glu Lys Phe Gly
1060 1065 1070

TTC AAA GTC TTA TAC ATA GAC ACA GAT GGA CTC TAC GCC ACA ATT CCT 3623 
Phe Lys Val Leu Tyr Ile Asp Thr Asp Gly Leu Tyr Ala Thr Ile Pro
1075 1080 1085 

GGG GCA AAA CCC GAG GAG ATA AAG AAG AAA GCC CTA GAG TTC GTA GAT 3671 
Gly Ala Lys Pro Glu Glu Ile Lys Lys Lys Ala Leu Glu Phe Val Asp
1090 1095 1100

TAT ATA AAC GCC AAG CTC CCA GGG CTG TTG GAG CTT GAG TAC GAG GGC 3719 
Tyr Ile Asn Ala Lys Leu Pro Gly Leu Leu Glu Leu Glu Tyr Glu Gly
1105 1110 1115

TTC TAC GTG AGA GGG TTC TTC GTG ACG AAG AAG AAG TAT GCG TTG ATA 3767 
Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys Lys Tyr Ala Leu Ile
1120 1125 1130 1135

GAT GAG GAA GGG AAG ATA ATC ACT AGG GGG CTT GAA ATA GTC AGG AGG 3815 
Asp Glu Glu Gly Lys Ile Ile Thr Arg Gly Leu Glu Ile Val Arg Arg
1140 1145 1150

GAC TGG AGC GAA ATA GCC AAA GAA ACC CAA GCA AAA GTC CTA GAG GCT 3863 
Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala Lys Val Leu Glu Ala
1155 1160 1165

ATC CTA AAG CAT GGC AAC GTT GAG GAG GCA GTA AAG ATA GTT AAG GAG 3911 
Ile Leu Lys His Gly Asn Val Glu Glu Ala Val Lys Ile Val Lys Glu
1170 1175 1180

GTA ACT GAA AAG CTG AGC AAG TAC GAA ATA CCT CCA GAA AAG CTA GTT 3959 
Val Thr Glu Lys Leu Ser Lys Tyr Glu Ile Pro Pro Glu Lys Leu Val
1185 1190 1195

ATT TAC GAG CAG ATC ACG AGG CCC CTT CAC GAG TAC AAG GCT ATA GGT 4007 
Ile Tyr Glu Gln Ile Thr Arg Pro Leu His Glu Tyr Lys Ala Ile Gly
1200 1205 1210 1215 



CCG CAC GTT GCC GTG GCA AAA AGG TTA GCC GCT AGA GGA GTA AAG GTG 4055
Pro His Val Ala Val Ala Lys Arg Leu Ala Ala Arg Gly Val Lys Val
1220 1225 1230

AGG CCT GGC ATG GTG ATA GGG TAC ATA GTG CTG AGG GGA GAC GGG CCA 4103 
Arg Pro Gly Met Val Ile Gly Tyr Ile Val Leu Arg Gly Asp Gly Pro
1235 1240 1245

ATA AGC AAG AGG GCT ATC CTT GCA GAG GAG TTC GAT CTC AGG AAG CAT 4151 
Ile Ser Lys Arg Ala Ile Leu Ala Glu Glu Phe Asp Leu Arg Lys His
1250 1255 1260

AAG TAT GAC GCT GAG TAT TAC ATA GAA AAT CAG GTT TTA CCT GCC GTT 4199
Lys Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln Val Leu Pro Ala Val
1265 1270 1275 

CTT AGA ATA TTA GAG GCC TTT GGG TAC AGG AAA GAA GAC CTC AGG TGG 4247 
Leu Arg Ile Leu Glu Ala Phe Gly Tyr Arg Lys Glu Asp Leu Arg Trp
1280 1285 1290 1295 

CAG AAG ACT AAA CAG ACA GGT CTT ACG GCA TGG CTT AAC ATC AAG AAG 4295 
Gln Lys Thr Lys Gln Thr Gly Leu Thr Ala Trp Leu Asn Ile Lys Lys
1300 1305 1310 

AAG TAATGTTTAT GTACTCGTAA TGCGAGTATT AAGTGGGTGA TGAGATGGCA 4348 
Lys 

GTATTGAGCA TAAGGATTCC GGATGATCTA AAAGAGAAGA TGAAGGAGTT TGACATAAAC 4408 
TGGAGTGAGG AGATCAGGAA GTTCATAAAA GAGAGGATAG AGTATGAGGA AAGGAAGAGA 4468 
ACCCTTGAGA AAGCTCTAGA ACTTCTAAAG AATACTCCAG GATCAGTCGA GAGAGGATTT 4528 
TCAGCAAGGG CAGTGAGGGA GGATCGTGAT AGTCATTGAT GCATCAATCC TAGCTAAAAT 4588 
AATTCTAAAA GAAGAGGGCT GGGAACAGAT AACTCTTACA CCGAGCACGA TAACTTTGGA 4648 
CTATGCTTTT GTTGAATGTA CAAACGCAAT ATGGAAGGCT GTCAGGCGGA ACAGGATCC 4707 
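The three-letter amino acid rows printed beneath the codon rows above can be cross-checked against the DNA by translating each row with the standard genetic code. The following is a minimal sketch only; the helper name `translate_row` and the use of "Ter" for stop codons are illustrative assumptions and are not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_1 = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",  # stop codon (label chosen for this sketch)
}
CODON_TABLE = {
    "".join(codon): ONE_TO_THREE[aa]
    for codon, aa in zip(product(BASES, repeat=3), AMINO_1)
}


def translate_row(dna_row: str) -> str:
    """Translate one row of space-separated codons into three-letter residues."""
    bases = "".join(dna_row.split()).upper()
    if len(bases) % 3 != 0:
        raise ValueError("row does not contain a whole number of codons")
    return " ".join(CODON_TABLE[bases[i:i + 3]] for i in range(0, len(bases), 3))


# Row ending at nucleotide 2663 of the listing above.
row = "ATT GGA ACT AGA AAT GGA TTC AGA ATG GGT ACG TTC GTA GAT ATT GAT"
print(translate_row(row))
```

A row whose printed translation disagrees with the output of such a check is more likely to reflect a transcription error in the printed listing than a feature of the sequence itself.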



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 
AGTGTCTCCG GAGAAAGTGA GAT 23 
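Each nucleic acid entry in this listing states a LENGTH under item (A) and then gives the residues in blocks of ten. As a hedged illustration, the stated length can be compared with the residues actually printed; the function name `check_listing_entry` is hypothetical, and the trailing residue count printed at the end of a sequence line is assumed to have been removed before the call.

```python
def check_listing_entry(sequence_line: str, stated_length: int) -> list:
    """Return a list of problems found in one nucleic acid entry (empty if none)."""
    residues = "".join(sequence_line.split()).upper()
    problems = []
    # Anything other than A, C, G or T suggests a transcription error.
    unexpected = sorted(set(residues) - set("ACGT"))
    if unexpected:
        problems.append("unexpected characters: " + "".join(unexpected))
    if len(residues) != stated_length:
        problems.append(f"stated {stated_length}, found {len(residues)}")
    return problems


# SEQ ID NO: 3 as given above, stated LENGTH 23 base pairs.
print(check_listing_entry("AGTGTCTCCG GAGAAAGTGA GAT", 23))
```

An empty list indicates that the printed residues are consistent with the stated length.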
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AGTATTGTGT ACCAGGATGT TG 22 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
AGCATTTTAC CGGAAGAATG GGTT 24
(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
GCTATTATGT GCATAGAGGA ATCCA
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 
AGGGTCGACA GATTTGATCC AGCG 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GAGAACTTTG TTCGTACCTG 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
GGTATTATTT CTTCTAAAGC A 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
GTTGTTTGTT GGTTTTACCA 20
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
ATGGCAAATG CTGTATGGAT 20
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
AGTGTCTCCG GAGAAAGTGA GAT 23 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATTGTGTACT AGTATGTTGT TTGCAA 26 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
GCCTCCGGAG ACACTATCGC CAAAATCACC GCCGTAA 37 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GCCACTAGTA CACAATACGC CGAACGATCG CCAGTTCT 38 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
CCTTCTAGAC CGGTGCAGTA TGAAGG
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

GCCGTCGACC CTAGTGTCTC AGGAGAAAGT GAGATC

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GCCTCTAGAA TTGTGTACCA GGATGTTGTT TGC 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
GCAAAGAACC GGTGCGTCTC TTC 
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
AGCAACAGAG TTACCTCTTG 
(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
CAGTTTCCAG CTCCTACAAT GAGACCTACG AGC 

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
GTAGTGTCGA CCCCATGCGG 20 
(2) INFORMATION FOR SEQ ID NO:23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 
CGTTTTGCCT GATTATTATC TCACTTTC 28 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
GTCCACCTTC GAAAAAAGAT CC 22 
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25: 
CCGCATAAAG GACCTTAAAG C 21 
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

GAGGAAGAGA TCATCATCAT AGC 23 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: 
GTCCTTCGTG CGGACAGTGT CTCAGGAGAA AGTGAGATAA 40 
(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 
GTCCTTTATG CGGACTAGGT CTCAGGAGAA AGTGAGATAA 40 
(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:
CCGGTTCTTT GCAAACAACA TCCTGGTACA CAATTAAGAC GGCTTTTATG CCACAATACC 60 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Ile Lys Ile Leu Ala Asn Ser Ile Leu Pro Glu Glu Trp Val Pro Leu
1 5 10 15

Ile Lys Asn Gly Lys Val
20 

(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Ile Lys Leu Leu Ala Asn Ser Ile Leu Pro Asn Glu Trp Leu Pro
1 5 10 15

Ile Ile Glu Asn Gly Glu Ile

20

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Lys Val Leu Tyr Ala Asp Ser Val Ser Gly Glu Ser Glu Ile Ile Ile
1 5 10 15

Arg Gln Asn Gly Lys Ile
20

(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

Ala Ile Leu Tyr Val Gly Cys Gly Ala Lys Gly Thr Asn Val Leu Met
1 5 10 15

Ala Asp Gly Ser Ile Glu

20 

(2) INFORMATION FOR SEQ ID NO: 34: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 

Lys Val Val Lys Asn Lys Cys Leu Ala Glu Gly Thr Arg Ile Arg Asp
1 5 10 15

Pro Val Thr Gly Thr Thr 

20 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

Glu Asx Gly Lys Ala Gly Phe Gly Phe Leu Tyr Ala His Asn Ser Tyr 
1 5 10 15

Tyr Gly Tyr Tyr Gly Tyr Ala 
20 

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

Glu Asn Phe Leu Val Gly Phe Gly Leu Leu Tyr Ala His Asn Ser Tyr 
1 5 10 15 

Tyr Gly Tyr Met Gly Tyr Pro 
20 

(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:

Glu Thr His Arg Phe Phe Ala Asn Asn Ile Leu Val His Asn Thr Asp
1 5 10 15

Gly Phe Tyr Ala Thr Ile Pro
20 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: 

Asp His Gln Phe Leu Leu Ala Asn Gln Val Val Val His Asn Cys Gly
1 5 10 15

Glu Arg Gly Asn Glu Met Ala 
20 

(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:

Glu Leu His Thr Leu Val Ala Glu Gly Val Val Val His Asn Cys Ser
1 5 10 15

Pro Pro Phe Lys Gln Ala Glu
20 

(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GCAATTATGT GCATAGAGGA ATCCA
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GGTATTATGT GCATAGAGGA ATCCA
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
ATTATGTGCA TAGAGGAATC CAAAG 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GGTACCCGTC GTGCTAGCAT TTTACCGGAA GAATGGGTAC CA 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
CCCGCTATTA TGTGCATAGA GGGATCC
(2) INFORMATION FOR SEQ ID NO: 45:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 
(ix) FEATURE 

(A) NAME/KEY: peptide 

(B) LOCATION: 1 

(D) OTHER INFORMATION: /note= "Xaa at position 1 = (Ala/Val)"
(ix) FEATURE 

(A) NAME/KEY: peptide 

(B) LOCATION: 4 

(D) OTHER INFORMATION: /note= "Xaa at position 4 = (Ser/Cys/Thr)"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

Xaa His Asn Xaa 4 
1 
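SEQ ID NO: 45 defines a four-residue pattern, Xaa His Asn Xaa, in which position 1 is Ala or Val and position 4 is Ser, Cys or Thr. As an illustrative sketch only, the pattern can be written as a regular expression over one-letter codes and searched for in the 23-residue peptides listed earlier; the names `MOTIF`, `to_one_letter` and `find_motif` are assumptions made for this example.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# SEQ ID NO: 45: (Ala/Val) His Asn (Ser/Cys/Thr)
MOTIF = re.compile(r"[AV]HN[SCT]")


def to_one_letter(three_letter_sequence: str) -> str:
    """Convert space-separated three-letter residues to one-letter codes."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_sequence.split())


def find_motif(three_letter_sequence: str):
    """Return (1-based start position, matched residues) or None."""
    match = MOTIF.search(to_one_letter(three_letter_sequence))
    return (match.start() + 1, match.group()) if match else None


seq_36 = ("Glu Asn Phe Leu Val Gly Phe Gly Leu Leu Tyr Ala His Asn Ser Tyr "
          "Tyr Gly Tyr Met Gly Tyr Pro")
seq_38 = ("Asp His Gln Phe Leu Leu Ala Asn Gln Val Val Val His Asn Cys Gly "
          "Glu Arg Gly Asn Glu Met Ala")

print(find_motif(seq_36))
print(find_motif(seq_38))
```

In both peptides the pattern begins at residue 12, so that the His-Asn dipeptide occupies residues 13 and 14 and is followed by a residue having an -OH or -SH side chain.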



(2) INFORMATION FOR SEQ ID NO: 46:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
GATCCCTCTA TGCACATAAT TCAGGCCTC 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
AATTGAGGCC TGAATTATGT GCATAGAGG 
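SEQ ID NO: 46 and SEQ ID NO: 47 appear to be a complementary pair. The sketch below tests that reading by comparing the reverse complement of SEQ ID NO: 47 with SEQ ID NO: 46 after setting aside the first four residues of each strand. Treating those four residues as single-stranded overhangs is an assumption made for this illustration (GATC and AATT would be compatible with BamHI and EcoRI cohesive ends, respectively); the listing itself does not state how the oligomers are used. The function names are likewise hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def duplex_core(top: str, bottom: str, overhang: int = 4):
    """Return the base-paired core if the strands anneal with 5' overhangs.

    The first `overhang` residues of each strand are assumed to remain
    single-stranded; None is returned if the remainder is not complementary.
    """
    core = top[overhang:]
    if reverse_complement(bottom)[:len(core)] == core:
        return core
    return None


top = "GATCCCTCTATGCACATAATTCAGGCCTC"     # SEQ ID NO: 46
bottom = "AATTGAGGCCTGAATTATGTGCATAGAGG"  # SEQ ID NO: 47

core = duplex_core(top, bottom)
print(top[:4], core, bottom[:4])
```

Under these assumptions the two oligomers anneal over 25 base pairs, leaving a 5' GATC overhang on one strand and a 5' AATT overhang on the other.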
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

GCTCGAGGCT AGCATTTTAC CGGAAGAATG GGTAC 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 



CCATTCTTCC GGTAAAATGC TAGCCTCGAG CGTAC
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

GATCCCTCTA TAAGCATAAT TCAGG 25 

(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CCTGAATTAT GCTTATAGAG G 21 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
GATCCCTCTA TGCACTGAAT TCAGG 25 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
CCTGAATTCA GTGCATAGAG G 21 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54: 

GTCAGGCCTC TCAGACAGTA CAGCTCGTAC AT 32

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
AGGCCT

(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56: 
CCCCTGCAGT TAAAAGTAAT TGCTTTCCAA ATAAG 
(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 
CTGCAG 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
GGAATTCCAT ATGAAAATCG AAGAAGGT 
(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
CGGGATCCCG TTATAGTGAG ATAACGTCCC G 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown
(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

GGAATTCCAT ATGCCAGAGG AAGAACTG 28 

(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
ATAGTTTAGC GGCCGCTCAC GACGTTGTAA AACG 34 
(2) INFORMATION FOR SEQ ID NO: 62:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:
TCGAGGCTAG CAAATTACCG GAAGAATGGG TAC 33 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
CCATTCTTCC GGTAATTTGC TAGCC 25 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
TCGAGGCTTG CATTTTACCG GAAGAATGGG TAC 33 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
CCATTCTTCC GGTAAAATGC AAGCC 
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:
GATCCCTCTA TAAGCATAAT ATTGGCATGC ACTA 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67:
TACTGCATGC CAATATTATG CTTATAGAGG
(2) INFORMATION FOR SEQ ID NO: 68:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
GATCCCTCTA TGCACATAAT TAAGGCATG 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:
CCTTAATTAT GTGCATAGAG G
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70:
GCGCTCGAGG GGTGCTTTGC CAAGGGTACC AAT 33
(2) INFORMATION FOR SEQ ID NO: 71:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
CCTCCGCAAT TATGGACGAC AACCTGGT 28 
(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
GAATGCGGAA TTCAGGCCTC CGCA 24 
(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

ATGGACGACA ACCTGGGATC CAAGCAAAAA CTGATGATC 39 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 35 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
GATCCCAGGT TGTCGTCCAT GCATGCGGAG GCCTG 35
(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
AATTCAGGCC TCCGCATGCA TGGACGACAA CCTGG 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
GATCCCAGGT TGTCGTCCAT GCATGCGGTG GCCTGA 
(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: Other Nucleic Acid 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77: 
CCGGTCAGGC CTCCGCATGC ATGGACGACA ACCTGG

WHAT IS CLAIMED IS:



1. A modified protein comprising a target protein and a controllable 
intervening protein sequence, wherein said controllable 
intervening protein sequence is capable of excision or cleavage 
under predetermined conditions in cis or in trans. 

2. The modified protein of claim 1, wherein the controllable 
intervening protein is inserted into the target protein. 

3. The modified protein of claim 1, wherein the controllable
intervening protein sequence is inserted into a region of the target 
protein such that the target protein is rendered substantially 
inactive. 

4. The modified protein of claim 3, wherein upon excision of the 
controllable intervening protein sequence the activity of the target 
protein is substantially restored. 

5. The modified protein of claim 1, wherein the controllable 
intervening protein sequence is fused to the target protein. 

6. The modified protein of claim 5, wherein the controllable 
intervening protein sequence is fused at C-terminal or N-terminal 
of the target protein. 

7. The modified protein of claim 1, wherein the controllable
intervening protein sequence encodes an endonuclease having 
homology to a homing endonuclease. 

8. The modified protein of claim 7, wherein the endonuclease 
function of the controllable intervening protein sequence has been 
substantially inactivated.

9. The modified protein of claim 1, wherein the controllable
intervening protein sequence and the target protein form a splice 
junction. 

10. The modified protein of claim 9, wherein the splice junction 
comprises at the C-terminal side amino acid residues having -OH 
or -SH side chains for splicing. 

11. The modified protein of claim 10, wherein the splice junction
comprises at the 3' end of the controllable intervening protein 
sequence a His-Asn dipeptide. 

12. The modified protein of claim 11, wherein the controllable
intervening protein sequence is selected from the group consisting
of controllable intervening protein sequence 1, 2, or 3.

13. The modified protein of claim 12, wherein the controllable 
intervening protein sequence is inserted immediately before a 
serine, threonine or cysteine residue of the target protein. 

14. The modified protein of claim 12, wherein the controllable 
intervening protein sequence contains a serine, threonine or 
cysteine residue at its 5' end. 
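The sequence requirements recited in claims 11, 13 and 14 can be pictured with a toy string model in which proteins are written in one-letter code. This is a hedged sketch only: it treats excision as removal of the inserted segment and rejoining of the flanking target sequence, it says nothing about the chemistry or the predetermined conditions, and the example sequences and function names are hypothetical rather than taken from the specification.

```python
# Residues with -OH or -SH side chains: Ser, Thr, Cys.
NUCLEOPHILES = set("STC")


def insert_civps(target: str, civps: str, position: int) -> str:
    """Insert a CIVPS immediately before target[position]."""
    if not civps.endswith("HN"):
        raise ValueError("CIVPS should end with a His-Asn dipeptide (claim 11)")
    if civps[0] not in NUCLEOPHILES:
        raise ValueError("CIVPS should begin with Ser, Thr or Cys (claim 14)")
    if target[position] not in NUCLEOPHILES:
        raise ValueError("insert immediately before Ser, Thr or Cys (claim 13)")
    return target[:position] + civps + target[position:]


def excise_civps(modified: str, civps: str) -> str:
    """Remove the CIVPS and rejoin the flanking target sequences."""
    start = modified.find(civps)
    if start < 0:
        raise ValueError("CIVPS not found in modified protein")
    return modified[:start] + modified[start + len(civps):]


target = "MKTAYSLLE"   # hypothetical target protein
civps = "SILPEEWVHN"   # hypothetical CIVPS ending in His-Asn

modified = insert_civps(target, civps, 5)  # residue at index 5 is Ser
print(modified)
print(excise_civps(modified, civps) == target)
```

In this model, excision returns the original target sequence unchanged, which corresponds to the restoration of target protein activity recited in claim 4.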

15. The modified protein of claim 10, wherein at least one residue 
having an -OH or -SH side chain is modified such that cleavage is 
reduced. 

16. The modified protein of claim 15, wherein the modification is an 
amino acid substitution. 

17. The modified protein of claim 15, wherein the modification is a
post-translational or co-translational chemical derivatization of the 
side chain.

18. The modified protein of claim 16, wherein the amino acid
substitution substitutes at least one of the amino acids involved in
the splicing reaction with a derivative in which the functionality of 
the side chain is masked by a removable group. 

19. The modified protein of claim 17, wherein at least one of the amino
acids involved in the splicing reaction is chemically derivatized 
such that the functionality of the side chain is masked by a 
removable group. 

20. The modified protein of claim 18, wherein the removable group is 
chemically or photolytically removable. 

21. The modified protein of claim 1, wherein the predetermined 
condition is selected from the group consisting of increase in 
temperature, addition of a chemical reagent which facilitates 
splicing or cleavage, change in pH and exposure to light. 

22. The modified protein of claim 1, wherein the controllable 
intervening sequence is derived from Saccharomyces. 

23. A method of producing a modified protein which comprises: 

(a) joining a DNA encoding a controllable intervening protein 
sequence with a DNA encoding a target protein to form a 
fusion DNA; 

(b) expressing said fusion DNA to produce the modified target 
protein. 

24. The method of claim 23, wherein the DNA encoding the
controllable intervening protein sequence is joined with the DNA 
encoding the target protein by inserting it into the DNA encoding 
the target protein.

25. The method of claim 24, wherein the controllable intervening
protein sequence DNA is inserted at a site appropriate for 
substantially decreasing the activity of the target protein. 

26. The method of claim 25, wherein upon excision of the controllable
intervening protein sequence the activity of the target protein is 
substantially restored. 

27. The method of claim 23, wherein the DNA encoding the 
controllable intervening protein sequence is joined with the DNA 
encoding the target protein by fusing it to the DNA encoding the 
target protein. 

28. The method of claim 23, wherein the controllable intervening 
protein sequence encodes an endonuclease having homology
to a homing endonuclease. 

29. The method of claim 28, wherein the endonuclease function of the 
controllable intervening protein sequence has been substantially 
inactivated. 

30. The method of claim 28, wherein the controllable intervening 
protein sequence is selected from the group consisting of 
controllable intervening protein sequence 1, 2, or 3. 

31. The method of claim 30, wherein the controllable intervening
protein sequence is inserted immediately before a serine, 
threonine or cysteine residue of the target protein. 

32. The method of claim 30, wherein the controllable intervening protein
sequence contains a serine, threonine or cysteine residue at its 5' 
end. 

33. A method of producing a protein comprising: 



(a) inserting a DNA encoding a controllable intervening protein 
sequence into a DNA encoding a target protein, wherein 
said controllable intervening protein sequence is capable of 
excision under predetermined conditions; 

(b) expressing the DNA of step (a) to produce a modified target 
protein; and 

(c) subjecting the modified target protein to conditions under 
which said controllable intervening protein sequence will 
undergo excision. 

34. A method of producing a protein comprising: 

(a) producing a first modified protein comprising an amino 
portion of a target protein into which is inserted at its 
carboxy terminus a controllable intervening protein 
sequence; 

(b) producing a second modified protein comprising the 
remaining portion of the target protein of step (a) into which 
is inserted at its amino terminus a controllable intervening 
protein sequence; and 

(c) placing the first and second modified proteins under 
predetermined conditions appropriate for splicing of the 
controllable intervening protein sequence. 

35. The method of claim 34, wherein the controllable intervening 
protein sequence inserted at the carboxy terminus of the target 
protein comprises an amino terminal fragment of the controllable 
intervening protein sequence and the controllable intervening 
protein sequence inserted at the amino terminus of the remaining 
portion of the target protein comprises the remaining fragment of 
the controllable intervening protein sequence. 

36. A method of producing a protein comprising: 

(a) producing a first modified protein comprising a portion of a 
target protein into which a controllable intervening protein 
sequence is inserted at a splice junction; and 

(b) placing the first modified protein with the remaining portion
of the target protein under predetermined conditions 
appropriate for splicing of the controllable intervening 
protein sequence. 

37. A method for purification of a target protein comprising:

(a) forming a fusion protein comprising a controllable 
intervening protein sequence between a target protein and 
a binding protein having affinity for a substrate; 

(b) contacting the fusion protein with substrate to which the 
binding protein binds; 

(c) subjecting the substrate bound fusion protein to conditions 
under which cleavage of the controllable intervening protein 
sequence occurs, thus separating the target protein from the 
binding protein; and 

(d) recovering the target protein. 

38. The method of claim 37, wherein the substrate is contained within 
an affinity column. 

39. The method of claim 37, wherein the binding protein is selected 
from the group consisting of sugar binding protein, chitin binding 
protein, receptor protein, amino acid binding protein, sulfate 
binding protein, vitamin binding protein, metal binding protein, 
phosphate binding protein, lectin binding protein or nucleic acid 
binding protein. 

40. A method for purification of a target protein comprising: 

(a) forming a fusion protein comprising a controllable 
intervening protein sequence and a target protein; 

(b) contacting the fusion protein with a substrate to which the
controllable intervening protein sequence binds; 

(c) subjecting the substrate bound fusion protein to conditions 
under which cleavage of the controllable intervening protein
sequence occurs, thus separating the target protein from the
controllable intervening protein sequence; and

(d) recovering the target protein.

41. The method of claim 40, wherein the substrate is an antibody
against the controllable intervening protein sequence. 

42. The modified protein of claim 9, wherein for cleavage, the splice 
junction comprises at the C-terminal side of the splice junction 
amino acid residues not having -OH or -SH side chains for C- 
terminal cleavage and having amino acid residues having -OH or 
-SH side chains only at the upstream splice junction for N-terminal 
cleavage. 



43. A method of producing a protein comprising: 

(a) producing a first modified protein comprising a target protein 
into which is inserted at its carboxy terminus an amino- 
terminal portion of a controllable intervening protein 
sequence; 

(b) producing a second modified protein comprising the 
remaining portion of the controllable intervening protein 
sequence; and 

(c) placing the first and second modified proteins under 
predetermined conditions appropriate for cleavage of the 
controllable intervening protein sequence in trans. 



44. A method of producing a protein comprising: 

(a) producing a first modified protein comprising a target protein 
into which is inserted at its amino terminus a carboxy- 
terminal portion of a controllable intervening protein 
sequence; 

(b) producing a second modified protein comprising the
remaining portion of the controllable intervening protein
sequence; and

(c) placing the first and second modified proteins under
predetermined conditions appropriate for cleavage of the
controllable intervening protein sequence in trans.

45. A method for purification of a target protein comprising: 

(a) forming a fusion protein comprising a portion of a 
controllable intervening protein sequence between a target 
protein and a binding protein having affinity for a substrate; 

(b) contacting the fusion protein with substrate to which the 
binding protein binds; 

(c) combining the remaining portion of the controllable 
intervening protein sequence with the fusion protein; 

(d) subjecting the combined substrate bound fusion protein of 
step (c) to conditions under which cleavage of the 
controllable intervening protein sequence occurs, thus 
separating the target protein from the binding protein; and 

(e) recovering the target protein. 

46. The method of claim 45, wherein the remaining portion of the 
controllable intervening protein sequence has an affinity tag. 

47. A method for purification of a target protein comprising:

(a) forming a fusion protein comprising a portion of a 
controllable intervening protein sequence between a target 
protein and a binding protein having affinity for a substrate; 

(b) contacting the fusion protein with substrate to which the 
binding protein binds; 

(c) recovering the fusion protein of step (b); 

(d) combining the remaining portion of the controllable 
intervening protein sequence with the fusion protein; 

(e) subjecting the substrate bound fusion protein to conditions 
under which cleavage of the controllable intervening protein
sequence occurs, thus separating the target protein from the 
binding protein; and 

(f) recovering the target protein. 

48. The method of claim 47, wherein the remaining portion of the
controllable intervening protein sequence has an affinity tag. 

49. The method of any one of claims 23, 33, 34, 36, 37, 40, 43, 44, 45, or 47,
wherein the controllable intervening protein sequence is derived 
from Saccharomyces. 